The cobalt glycyl-leucine dipeptide is a model system for studying the effects of Karplus equation 
calibration, molecular mechanics accuracy, backbone conformation, and thermal motions on the 
measurability of side chain rotational isomer populations. We analyze measurements of 8 vicinal 
coupling constants about the α- to β-carbon and β- to γ-carbon bonds of the leucine side chain and of 
10 NOESY cross relaxation rates across these bonds. Molecular mechanics and peptide and protein 
crystallographic databases are an essential part of this analysis because they independently suggest 
that the trans gauche⁺ and gauche⁻ trans rotational isomers of the leucine side chain predominate. 
They also both suggest that puckering of the cobalt dipeptide ring system reduces the gauche⁺ 
gauche⁺ rotational isomer population to less than about 10%. At the present ±1 Hz calibration 
accuracy of Karplus equations for vicinal coupling constants, the predominant trans gauche⁺ and 
gauche⁻ trans rotational isomer populations can be measured with about 5% accuracy, but the 
population of the gauche⁺ gauche⁺ rotational isomer is probably very near or just below the limit 
of measurability. These estimates also depend upon qualitative assessments of the accuracy of the 
molecular mechanics energy wells. We introduce gel graphics that are ideally suited to presenting 
qualitative error and measurability estimates. 



I. INTRODUCTION 

Comparisons of NMR structures with X-ray structures 
show that vicinal coupling constants accurately measure 
the backbone torsion angles of proteins [2, 3]. In the 
best X-ray structures multiple conformations of a particular 
side chain can be confidently identified [4, 5, 6, 7, 8], 
but population estimates may not be very accurate. The 
populations of the three χ¹ rotational isomers of protein 
side chains can be determined from vicinal coupling 
constants if the rotational isomers are assumed to have 
ideal staggered conformations [9, 10]. If the side chain 
rotational isomers are not assumed to have ideal confor- 
mations and the amplitudes of the torsion angle fluctu- 
ations of each rotational isomer are also unknown, then 
in general it is only possible to measure rotational iso- 
mer populations as statistical averages over many side 
chains [11]. Even with a very complete set of vicinal 
coupling constants and NOESY cross relaxation rates it 
is difficult to measure the populations of more than two 
rotational isomers of a single side chain [12]. Protein 
NMR and crystallographic data available now or in the 
foreseeable future simply do not have the resolution to 
measure the populations of all possible side chain rota- 
tional isomers at anywhere near the 1% level of accuracy. 
Understanding protein properties such as fluorescence intensity 
decay spectra [13, 14, 15, 16, 17, 18], hydrogen 
ion association constants [19, 20, 21], or global stability 
[22, 23] often requires that an NMR or X-ray structure be 
supplemented with molecular mechanics structure calcu- 
lations of the conformations of a particular side chain. An 
ideal structure determination method would incorporate 
these supplemental molecular mechanics calculations and 
simultaneously fit NMR or crystallographic data. The 
consistency of the data and the incorporated molecular 
mechanics could be judged by existing methods [24, 25] 
for assessing when a model over-fits the data. In the case 
of measuring side chain rotational isomer populations it 
might be possible to measure the population of one or 
two prominent rotational isomers and assess the mea- 
surability of other molecular mechanically plausible ro- 
tational isomers. In this work we take a simple first step 
in this direction with an analysis of the cobalt glycyl- 
leucine dipeptide model system. This model system has 
the advantages that the vicinal coupling constants and 
NOESY cross relaxation rates can be accurately mea- 
sured on samples with natural isotope abundance and 
that the cobalt dipeptide ring system restrains the dipep- 
tide backbone in a single conformation. 

The two background sections give essential information 
about the conformational analysis of leucine side chain 
rotamers and about the accuracy of the Karplus equation 
coefficients. The experimental section gives the vicinal 
coupling constant and NOESY cross relaxation rate data 
for the cobalt glycyl-leucine dipeptide. Simple two and 
three rotational isomer models suggested by conforma- 
tional analysis are compared and fit to some of this data. 
The computational results and discussion section exam- 
ines the molecular mechanics energy map, the effect of 
intramolecular thermal motions on calculated NMR cou- 
pling constants and cross relaxation rates, and the Monte 
Carlo probability density functions of the rotational iso- 
mer populations. For the simple cobalt dipeptide model 
system these probability density functions confirm the 
preliminary analysis in the experimental section and ad- 
ditionally suggest that the populations of the remain- 
ing rotational isomers are unmeasurable at present. Our 
analysis shows that while the NMR data is not fit too 
well by the simplest models neither does this data give 
any guidance in selecting among the multitude of models 
with improved fits. 



II. BACKGROUND 

A. Leucine side chain conformational analysis 

The leucine side chains of crystallographic oligopeptide 
and protein structures strongly prefer the trans gauche⁺ 
and gauche⁻ trans rotational isomers [26, 27]. Conformational 
analysis predicts the backbone-dependent stability 
of protein side chain conformations and explains the rotamer 
preferences observed in high resolution crystallographic 
structures of proteins [28, 29]. This analysis is 
equally applicable to the leucine side chain of the cobalt 
glycyl-leucine dipeptide and leads to the same conclusions 
found for leucine side chains in proteins. In either 
case the predominant rotational isomers of the leucine 
side chain are trans gauche⁺ and gauche⁻ trans. The butane 
and syn-pentane effects are known from the conformational 
analysis of the simple hydrocarbons n-butane 
and n-pentane. The syn-pentane conformations, which 
are gauche⁺ gauche⁻ or gauche⁻ gauche⁺, are about 
3.3 kcal/mol higher in energy than the extended trans 
trans conformation. A molecular conformation is said to 
be destabilized by the syn-pentane effect when one (or 
more) five atom fragments of the molecule are in syn-pentane 
like conformations. The conformational analysis 
of a peptide or protein identifies unfavorable side chain 
conformations primarily by searching for syn-pentane effects 
among all possible n-pentane fragments with the 
Cα-Cβ bond in the second or third fragment position. 
When the Cα-Cβ and Cβ-Cγ bonds are in the second and 
third fragment positions, the analysis gives backbone-independent 
rotamer preferences because the fragment 
conformation does not depend on the backbone φ and ψ 
angles. When the backbone N-Cα or C'-Cα bond is in 
the second and the Cα-Cβ bond is in the third fragment 
position, the analysis gives backbone-dependent rotamer 
preferences. For a leucine residue there are eight pentane 
fragments to consider: four backbone-independent 
fragments of the pattern (N, C')-Cα-Cβ-Cγ-(Cδ1, Cδ2), 
and four backbone-dependent fragments of the patterns 
(C'_{i-1}, O···HN_i)-N_i-Cα_i-Cβ_i-Cγ_i or 
(N_{i+1}, O_i)-C'_i-Cα_i-Cβ_i-Cγ_i, where leucine is the 
ith residue and O···HN_i is an assumed hydrogen bond 
acceptor of HN_i. A very clear Newman projection diagram 
of these eight fragments is shown in Fig. 2 of Ref. [28]. 
Of the nine possible leucine rotamers only two, trans 
gauche⁺ and gauche⁻ trans, have no backbone-independent 
syn-pentane interactions, six have one interaction, and 
one, gauche⁺ gauche⁻, has two such interactions. Because 
the identical backbone-independent fragments are present in the 
cobalt dipeptide and proteins, the backbone-independent 
conformational analysis for proteins applies equally to 
the cobalt dipeptide. 
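
The backbone-independent rotamer counts quoted in this paragraph can be checked with a short enumeration. The following Python sketch is an illustration and is not part of the original analysis. It assumes ideal staggered torsion angles (+60, 180, and −60 degrees for gauche⁺, trans, and gauche⁻), assumes that the C'-Cα-Cβ-Cγ torsion equals χ¹ − 120 degrees, and assumes that the Cα-Cβ-Cγ-Cδ2 torsion equals χ² + 120 degrees. All function and variable names are hypothetical.

```python
from itertools import product

# Ideal staggered torsion angles in degrees (assumption: no distortion).
STAGGERED = {"gauche+": 60, "trans": 180, "gauche-": -60}


def wrap(angle):
    """Map an angle in degrees onto the interval (-180, 180]."""
    angle = angle % 360
    return angle - 360 if angle > 180 else angle


def is_syn_pentane(d1, d2):
    """True when two consecutive torsions are gauche with opposite signs."""
    return {wrap(d1), wrap(d2)} == {60, -60}


def backbone_independent_count(chi1, chi2):
    """Count syn-pentane fragments of the pattern (N, C')-Ca-Cb-Cg-(Cd1, Cd2)."""
    first = [chi1, chi1 - 120]    # N and C' torsions about Ca-Cb (assumed offset)
    second = [chi2, chi2 + 120]   # Cd1 and Cd2 torsions about Cb-Cg (assumed offset)
    return sum(is_syn_pentane(a, b) for a, b in product(first, second))


# Rotamers are keyed as (chi1 name, chi2 name).
counts = {
    (name1, name2): backbone_independent_count(angle1, angle2)
    for (name1, angle1), (name2, angle2) in product(STAGGERED.items(), repeat=2)
}

for rotamer, n in sorted(counts.items(), key=lambda item: item[1]):
    print(rotamer, n)
```

Under these assumptions the enumeration gives zero interactions for trans gauche⁺ and gauche⁻ trans, two for gauche⁺ gauche⁻, and one for each of the remaining six rotamers, in agreement with the counts stated above.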

The conformational analysis of backbone-dependent 
rotamer preferences is important because the cobalt 
dipeptide backbone forms two approximately planar 
chelate rings with φ and ψ angles that differ somewhat from 
the angles most commonly found in protein structures. 
To apply backbone-dependent conformational analysis to 
the leucine side chain of the cobalt dipeptide (Fig. 1), simply 
note that leucine is the second residue and substitute 
the atom names Co, O^t1, and O^t2 for the names O···HN_i, 
O_i, and N_{i+1} in the above five atom fragments, where O^t1 
is the terminal carboxyl oxygen bonded to cobalt and 
O^t2 is the uncomplexed carboxyl oxygen. (The reverse 
substitution of O^t1 and O^t2 inconveniently generates the 
dipeptide from a protein with a very unlikely backbone 
conformation that has colliding amide groups.) With this 
identification the leucine backbone torsion angles φ₂ and 
ψ₂ are both near 180 degrees. If φ₂ = −174.74 degrees 
and leucine χ¹ is gauche⁻, there is a syn-pentane interaction 
between the glycine carbonyl carbon and the 
leucine Cγ (Fig. 1 middle). If ψ₂ = 174.74 degrees and 
leucine χ¹ is trans, there is a syn-pentane interaction 
between the uncomplexed terminal oxygen and the leucine 
Cγ (Fig. 1 top). Because φ₂ and ψ₂ of the cobalt dipeptide 
are both near these critical angles, syn-pentane effects 
could destabilize both of the leucine side chain trans 
gauche⁺ and gauche⁻ trans rotamers, which are observed 
in most crystallographic structures and preferred by the 
backbone-independent conformational analysis. 

Crystallographic studies of copper and cobalt dipeptides 
and the vicinal coupling constants about both N-Cα 
bonds of the cobalt glycyl-leucine dipeptide suggest that 
the φ₂ and ψ₂ torsion angles of the cobalt glycyl-leucine 
dipeptide depart from 180 degrees by as much as 10 or 
20 degrees. In the crystallographic structures of cobalt 
glycyl-glycine dipeptides [30] and copper dipeptides [31] 
the chelate ring conformations vary over a wide range. 
The peptide backbone atoms are typically displaced by 
up to 0.1 or 0.2 angstrom from the mean plane of the 
chelate rings and the angles between 3-atom segments of 
the chelate rings are typically 5 or 10 or perhaps as large 
as 20 degrees. Both these measures of chelate ring puckering 
imply backbone torsion angles of 180 ± 10 degrees 
with maximum deviations from 180 degrees of no more 
than about 20 degrees. The variability of the chelate 
FIG. 1: Stereo views of three of the nine possible rotational 
isomers of the cobalt dipeptide leucine side chain: top, 
trans gauche⁺; middle, gauche⁻ trans; and bottom, gauche⁺ 
gauche⁺. The three rotational isomers each have a single 
syn-pentane interaction, which occurs in the pentane fragment 
shown as a grey ball and stick representation with exaggerated 
stick diameters and labeled atoms. The cobalt atom is 
the labeled central black sphere. The gauche⁺ trans rotational 
isomer (not shown) also has only one syn-pentane interaction, 
but turns out to have a slightly higher energy than 
the gauche⁺ gauche⁺ rotational isomer. The five remaining 
rotational isomers (not shown) all have two syn-pentane 
interactions and are several kcal/mol higher in energy. Note 
that the syn-pentane interactions of the trans gauche⁺ and 
gauche⁻ trans rotational isomers (top and middle) depend on 
χ¹ and in addition the leucine backbone ψ or φ torsion angles, 
respectively. These are thus backbone-dependent syn-pentane 
interactions. The syn-pentane interaction of the gauche⁺ 
gauche⁺ rotational isomer (bottom) depends on the χ¹ and 
χ² torsion angles and is backbone-independent. The trans 
gauche⁺ and gauche⁻ trans rotational isomers have the lowest 
energies because the backbone-dependent syn-pentane 
interactions can be relieved by chelate ring puckering. 



ring conformations is thought to arise from intermolecular 
contacts within the crystal. Even if a crystallographic 
structure of the cobalt dipeptide were available, no reliable 
predictions about the solution conformation of the 
chelate rings could be made because the conformational 
distortions caused by intermolecular contacts are not well 
understood. In various DMSO plus D2O mixtures measurements 
of the four H-N-Cα-H vicinal coupling constants 
about the glycine N-Cα bond of the cobalt glycyl-leucine 
dipeptide show a rotation about this bond of 
−10 to −20 degrees [32]. This rotation angle is relative 
to an eclipsed substituent atom geometry about the N-Cα 
bond and implies a puckered amino-peptidato chelate 
ring. Though this puckering gives no direct information 
about the leucine φ₂ and ψ₂ torsion angles, the presence 
of puckering in solution shows that the intermolecular 
contacts within a crystal are not the only cause of puckering. 
The C'-N-Cα-H vicinal coupling constant about 
the N-Cα bond of the cobalt glycyl-leucine dipeptide is 
about 0.3 Hz larger than the same coupling constant of 
the cobalt glycyl-glycine dipeptide [32]. (In Ref. [32] the 
first atom of this coupling constant is mistakenly labeled 
Cα rather than C'.) This coupling constant difference implies 
that the cobalt dipeptide φ₂ torsion angle is about 
−170 ± 10 degrees. 



Simple inspection of backbone-dependent rotamer libraries 
suggests that a 20 degree departure from the 
backbone torsion angles of planar chelate rings can diminish 
or even eliminate the gauche⁺ χ¹ rotamer preference 
of the leucine side chain. The proportions of 
gauche⁻, gauche⁺, and trans χ¹ side chain rotamers 
change fairly dramatically for both 30 and 20 degree 
backbone angle increments within the backbone-dependent 
rotamer libraries of Dunbrack and Karplus 
[28, 29]. The rotamer libraries seem to show that 
backbone-independent interactions are slightly more important 
than backbone-dependent interactions. In the 
backbone-dependent rotamer library for straight side 
chains [28], which does not include leucine, the gauche⁻ 
and trans χ¹ rotamers appear with a combined frequency 
of 20 to 30% in the two 30 by 30 degree square 
cells adjacent to the point φ, ψ = 180 degrees, even 
though backbone-dependent syn-pentane interactions favor 
gauche⁺ χ¹ rotamers. A similar trend seems to hold 
throughout the backbone-dependent rotamer library: χ¹ 
rotamers excluded by syn-pentane interactions appear 
with diminished probability rather than being completely 
excluded. In contrast, the backbone-independent syn-pentane 
interactions of the leucine side chain exclude the 
gauche⁺ χ¹ rotamer with striking completeness. In crystallographic 
structures on which the rotamer library is 
based less than 2% of the leucines are gauche⁺ χ¹ rotamers 
[IH. The exclusion of leucine gauche⁺ χ¹ rotamers 
as well as gauche⁻ χ² rotamers seems to hold 
over all known classes of protein backbone structures [27]. 
The crystallographic and NMR evidence cited in the last 
paragraph suggests that the φ₂ and ψ₂ torsion angles of 
the cobalt dipeptide both depart from 180 degrees by as 
much as 10 or 20 degrees. This is more than enough 
to relieve the backbone-dependent syn-pentane interactions 
with the leucine Cγ and diminish the backbone-dependent 
preference for gauche⁺ χ¹ rotamers of the 
leucine side chain. 
B. Accuracy of Karplus equation calibration 

Theoretical calculations show that the H-C-C'-H' vicinal 
proton coupling constant depends on the torsion angle 
φ(H,C,C',H') around the C-C' bond, the electronegativity 
and orientation of substituent groups on the C and 
C' atoms, the bond angles θ(H,C,C') and θ(C,C',H'), and 
the length of the C-C' bond [33]. The same symbol φ 
serves here for the torsion angle between the vicinally 
coupled spins and elsewhere for the protein backbone 
torsion angle, but the meaning should always be clear 
from the context. The theoretical dependence of the coupling 
constant on the torsion angle is approximated by a 
Karplus equation, which is often written in the form 

$$ {}^3J(\phi) = A \cos^2\phi - B \cos\phi + C, \qquad (1) $$

where φ is the torsion angle around the C-C' bond. For 
peptide side chain vicinal coupling constants the torsion 
angle φ is equal to the side chain torsion angle 
to within a phase shift that is approximately an integer 
multiple of 120 degrees. The Karplus equation for 
a H-C-C'-C'' heteronuclear vicinal coupling constant 
also has the same form, where the torsion angle is now 
φ(H,C,C',C'') around the same C-C' bond. The accuracy 
of theoretical coupling constant calculations or of 
the Karplus equation fit to such calculations is at best 
around ±1 Hz. A more accurate Karplus equation is 
obtained by adjusting the coefficients to fit experimen- 
tal coupling constant data. The greatest improvement 
occurs when the Karplus equation coefficients are cali- 
brated for a four atom fragment with specific functional 
groups substituted in a specific orientation on the central 
two atoms. For example, the error in the coupling constants 
predicted by the Karplus equation calibrated for 
the protein H-N-Cα-Cβ heteronuclear coupling constant 
is ±0.25 Hz, as estimated by the RMS difference between 
the Karplus curve and the fit experimental data. If 
a Karplus equation that is calibrated for a four atom 
fragment with specific substituent groups is applied to 
the same four atom fragment with different substituent 
group chemistry or orientation, then the errors in the 
predicted coupling constants are dramatically increased. 
Similar large errors in the predicted coupling constants 
result if the Karplus equation is fit to a collection of ex- 
perimental coupling constants that are all measured for 
the same fixed four atom fragment, but with a variety of 
functional groups substituted in a variety of orientations 
on the two central atoms. For one data set of over 300 
experimental measurements of the H-C-C'-H' coupling 
constant in about 100 conformationally rigid compounds, 
which are largely 6-membered rings with holding groups, 
the RMS difference between the fit Karplus curve and 
experimental data is 1.2 Hz [34]. In this data set carbon 
and oxygen are the most frequent substituent atoms 
bonded to the central two carbon atoms of the H-C-C'-H' 
fragment, while nitrogen, sulfur, halogen, silicon, 
and selenium substituent atoms occur in smaller numbers. 
These two examples span the accuracy range of 
most calibrated Karplus equations, that is, errors in predicted 
coupling constants are in the range ±0.25 to ±1 
Hz. For a specific substituent group chemistry and orientation 
the error may be around ±0.5 Hz or even as low 
as ±0.25 Hz in favorable cases. For small to moderate 
variations in substituent group chemistry or orientation 
the error probably is in the range ±0.5 Hz to ±1 Hz. 
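
As a minimal sketch of how Eq. (1) is evaluated, the following Python fragment computes the coupling constant for a given torsion angle and a population-weighted average over discrete rotamers. It is an illustration only: the coefficient values are placeholders chosen for the example and are not a calibration taken from this work, and the function names are hypothetical.

```python
import math


def karplus(phi_deg, a, b, c):
    """Eq. (1): 3J(phi) = A cos^2(phi) - B cos(phi) + C, with phi in degrees."""
    cos_phi = math.cos(math.radians(phi_deg))
    return a * cos_phi ** 2 - b * cos_phi + c


def rotamer_averaged_coupling(populations, torsions_deg, a, b, c):
    """Population-weighted average of the couplings of discrete rotamers."""
    return sum(p * karplus(phi, a, b, c)
               for p, phi in zip(populations, torsions_deg))


# Placeholder coefficients in Hz (assumption, for illustration only).
A, B, C = 9.5, 1.5, 1.0

j_sc = karplus(60.0, A, B, C)    # synclinal spins: A/4 - B/2 + C
j_ap = karplus(180.0, A, B, C)   # antiperiplanar spins: A + B + C
print(j_sc, j_ap)
```

With these placeholder coefficients a uniform ±1 Hz calibration error is a large fraction of the synclinal coupling, which is one way to see why the calibration accuracy limits the measurability of small rotamer populations.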

Studies of the vicinal coupling constants of peptides 
and closely related compounds suggest that the above 
generalizations about the accuracy of calibrated Karplus 
equations apply to the vicinal coupling constants about 
the Cα-Cβ and Cβ-Cγ bonds of the cobalt dipeptide 
leucine side chain. Simple information about the effect of 
substituent group chemistry comes from alanine and its 
analogues, which have a single H-Cα-Cβ-H coupling constant 
because of the three-fold symmetry of the methyl 
side chain. The experimental H-C-C-H coupling constant 
of ethane is 8.0 Hz [35], the H-Cα-Cβ-H coupling 
constant of the alanine dipeptide is 7.3 Hz [36], and the 
H-Cα-Cβ-H coupling constant of the amino acid alanine 
remains almost exactly 7.3 Hz over the pH range of 0.5 
to 12.5 [37]. Replacing two protons on one ethane carbon 
atom with one carbon and one nitrogen substituent group 
drops the coupling constant by 0.7 Hz; changing the nitrogen 
substituent group electronegativities through the 
range ammonium > acetamide > amide and changing the 
carbon substituent group through the range carboxyl > 
N-methylamide > carboxylate does not change the coupling 
constant at all. But there are many counterexamples 
to this seeming insensitivity to substituent group 
chemistry. The H-C-C-H coupling constant of propane 
is 7.3 Hz and of isopropylamine is 6.3 Hz [35]. Replacing 
one ethane proton with one carbon substituent group already 
drops the coupling constant to the value observed 
for alanine, which has an additional nitrogen substituent 
group, and replacing a second proton on the same carbon 
atom with a nitrogen substituent group, making the 
substituted carbon equivalent to the alanine α-carbon, 
drops the coupling constant by an additional 1.0 Hz. The 
H-Cα-Cβ-H coupling constants of various alanine dipeptide 
derivatives are in the range 6.9 to 7.3 Hz [36], which 
shows that even substituent group changes one peptide 
bond removed from the α-carbon can change the coupling 
constant about the Cα-Cβ bond by at least 0.4 Hz. 

The β-carbon atoms of all the other amino acids lack 
the three-fold symmetry of the alanine β-carbon. The 
above examples of alanine methyl proton coupling across 
the Cα-Cβ bond probably underestimate the coupling 
constant variation due to α-carbon substituent group 
chemistry and totally ignore the effect of β-carbon substitution. 
For leucine two H-Cα-Cβ-H coupling constants 
between the α and β-protons and four heteronuclear coupling 
constants between the amide nitrogen and carbonyl 
carbon and β-protons are usually measurable. The simplest 
models for these vicinal coupling constants about 
the Cα-Cβ bond assume ideal gauche or trans torsion angles 
between the coupled spins and have four parameters: 
the populations of two of the three χ¹ rotational isomers 
and the gauche and trans coupling constants. The heteronuclear 
N-Cα-Cβ-H trans coupling constant of the 
leucine cation apparently decreases by 0.6 Hz when the 
cation is converted into the anion [38]. The effect of α-carbon 
substituent chemistry on the coupling constants 
about the Cα-Cβ bond is also seen in the 1-substituted 
derivatives of 3,3-dimethylbutane, which are analogues of 
the amino acid leucine with the α-carbon and side chain 
intact and with various replacements for the amine and 
carboxylate groups. Both the gauche and trans coupling 
constants of these analogues vary over the range of 0.7 Hz 
[39]. Furthermore, this same study found a 1 Hz difference 
in the average gauche coupling constant depending 
on whether the 1-substituent was gauche or trans to the 
coupled proton on the second carbon. This suggests that 
two separate Karplus equations are required for the two 
β-protons of leucine. Substituent orientation effects require 
two different Karplus equations for predicting the 
β-proton coupling constants of proline [40]. In a similar 
way the electronegativity corrections to the ³J_HαHβ(φ) 
Karplus equation for leucine probably depend on the orientation 
of the substituent groups with respect to both 
the Hα and Hβ protons [41]. 

The existing calibrations of the Karplus equations for 
vicinal coupling about the Cα-Cβ bond suffer from several 
sources of error. Most assume that for each α-carbon 
bonded atom one Karplus equation predicts the coupling 
of this atom to both β-protons. The calibrations 
are done with sets of model compounds that have normal 
or sometimes rather far from normal peptide backbone 
chemistries and that have a range of standard and 
nonstandard amino acid side chains. Errors arise because 
the set of model compounds is too varied or because 
none of the model compounds closely match the 
molecule of interest, whether it be a protein or as in this 
study the cobalt dipeptide. Model compounds such as 
2,3-substituted bicyclo[2.2.2]octanes [38, 42], gallichrome 
[43], and α-amide-γ-butyrolactones [44] differ from standard 
proteins in both backbone and side chain structure. The 
match to the molecule of interest may depend on a choice 
of coupling constants about several different bonds of 
the model compound. Because the gallichrome backbone 
and ornithyl side chains out to the β-carbon have essentially 
the same structure as standard proteins, the Cα-Cβ 
bonds of gallichrome are fairly well matched to the 
Cα-Cβ bonds of proteins. The substituent groups on 
the β and γ-carbons are obviously quite different from 
those on the α-carbon and coupling about the Cβ-Cγ 
and Cγ-Cδ bonds is somewhat different from that about 
the Cα-Cβ bond. If coupling constants about all three 
bonds are chosen to calibrate the Karplus equation for 
coupling about the Cα-Cβ bond, then the gallichrome 
model compound is not a very good match to proteins. 
Model compounds such as cyclo(triprolyl) peptide [36], 
an asparaginamide dipeptide, oxytocin cystine-1 and 6, 
and alumicrocin [44] match the backbone of standard proteins, 
but the side chains may differ from a specific amino 
acid side chain of interest. 



The residuals between the Karplus curve and the calibration 
data set are perhaps the best available indication 
of the accuracy of a Karplus equation calibration. However, 
these residuals generally underestimate the errors 
that occur when the Karplus equation is then applied to 
predict the vicinal coupling constants of a particular side 
chain of interest. Fischman et al. calibrate the ³J_C'Hβ(φ) 
Karplus equation by fitting the three Karplus coefficients 
to 4 coupling constants measured on two bicyclo-octanes 
[38]. For this fit the RMS residual per degree of freedom 
is 0.13 Hz. Due to the small calibration data set this tiny 
observed residual is a completely unreliable estimate of 
the true residual. Kopple et al. calibrate the ³J_HαHβ(φ) 
Karplus equation by fitting to 10 coupling constants measured 
on seven model compounds [36]. For this fit the 
RMS residual per degree of freedom is 0.47 Hz. The 
data set is just large enough to give a reliable residual 
estimate, but as discussed in the previous paragraph the 
model compounds may not be a very good match to proteins. 
DeMarco et al. calibrate the ³J_HαHβ(φ) Karplus 
equation by fitting 30 coupling constants measured about 
the Cα-Cβ, Cβ-Cγ and Cγ-Cδ bonds of the ornithyl side 
chains of gallichrome [43]. For this fit the RMS residual 
per degree of freedom is 0.92 Hz. This fairly large 
residual is apparently the result of fitting the coupling 
constants about all three side chain bonds. The errors in 
this calibration may be even larger than ±1 Hz because 
fitting the coupling constants about all three side chain 
bonds makes the gallichrome model compound a poor 
match to the Cα-Cβ bond of proteins and may add additional 
bias error to that suggested by the residuals. Cung 
and Marraud calibrate the ³J_HαHβ(φ) Karplus equation 
by fitting the three Karplus coefficients and eight angle 
parameters to 16 coupling constants measured on five 
model compounds [4J]. For this fit the RMS residual 
per degree of freedom is 0.49 Hz. Note Cung and Mar- 
raud arrive at a standard deviation of half this value by 
computing a straight RMS average of the 16 residuals 
rather than by averaging over the 5 degrees of freedom 
actually present. Though the five model compounds used 
for this calibration are well matched to the Cα-Cβ bond 
of proteins, the errors of this calibration are likely to 
be substantially larger than suggested by the residuals 
because the model compound torsion angles are not esti- 
mated from crystallographic structures or by molecular 
mechanics calculations. 
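
The distinction drawn above between a straight RMS average of the residuals and the RMS residual per degree of freedom can be made concrete with a short sketch. The Python functions below are illustrative and their names are hypothetical; the only numbers taken from the text are the counts of data points and fitted parameters.

```python
import math


def straight_rms(residuals):
    """RMS of the residuals, dividing by the number of data points."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def rms_per_degree_of_freedom(residuals, n_parameters):
    """RMS of the residuals, dividing by (data points - fitted parameters)."""
    dof = len(residuals) - n_parameters
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    return math.sqrt(sum(r * r for r in residuals) / dof)


def straight_to_per_dof(rms, n_data, n_parameters):
    """Convert a straight RMS into an RMS per degree of freedom."""
    return rms * math.sqrt(n_data / (n_data - n_parameters))


# 16 coupling constants fit with 3 Karplus coefficients plus 8 angle
# parameters leaves 16 - 11 = 5 degrees of freedom.
ratio = 1.0 / straight_to_per_dof(1.0, 16, 11)
print(ratio)  # straight RMS divided by RMS per degree of freedom
```

For 16 data points and 11 fitted parameters the ratio is the square root of 5/16, roughly 0.56, which is consistent with the statement that the straight RMS average is about half the residual per degree of freedom.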



Considering the wide variety of model compounds, the 
single calibration for both β-protons, the observed residuals, 
and uncertainties in the model compound structures, 
it seems extremely unlikely that the errors in the calibration 
of the Karplus equations for vicinal coupling about 
the Cα-Cβ and Cβ-Cγ bonds are significantly less than 
±1 Hz, whether the molecule of interest is a standard 
protein or peptide or as here the glycyl-leucine dipeptide 
complexed to cobalt. 
TABLE I: Proton assignments. Shifts are in ppm. The atom 
Hβ1 and atoms Hδ1 are pro-R and the atom Hβ2 and atoms 
Hδ2 are pro-S. 

Atom    δ (ppm)     Atom    δ (ppm)
Hα      4.1710      Hγ      1.6582
Hβ1     1.8125      Hδ1     0.8700
Hβ2     1.5440      Hδ2     0.8200


III. EXPERIMENTAL RESULTS 

The proton assignments in Table I are model dependent. 
These assignments depend on our assumption that 
the population of the leucine side chain rotational isomers 
with a gauche⁺ χ¹ torsion angle is small compared 
to the population of rotational isomers with gauche⁻ and 
trans χ¹ torsion angles. Without this assumption an unambiguous 
assignment is not possible. The conventional 
approach to assigning the β-protons examines the vicinal 
coupling constants about the Cα-Cβ bond and exploits 
the fact that a weak coupling is expected for synclinal 
spins and a strong coupling for antiperiplanar spins. 
When the leucine side chain χ¹ torsion is gauche⁻ the 
atoms Hα and Hβ1 are antiperiplanar and the atoms Hα 
and Hβ2 are synclinal, and when χ¹ is trans these angular 
magnitudes are reversed, that is Hβ1 is synclinal and Hβ2 
is antiperiplanar [45]. Thus the ³J_HαHβ coupling constant 
does not help with Hβ assignment, but is very helpful 
in determining the ratio of gauche⁻ to trans populations 
once this assignment is known. The alternating synclinal 
antiperiplanar geometries of the gauche⁻ and trans 
rotational isomers produce a conjugate ³J_HαHβ coupling 
pattern that is diagnostic of the absence of gauche⁺ χ¹ 
rotational isomers. The average of the two ³J_HαHβ couplings 
is independent of the rotational isomer populations 
and the coupling ratio (J_HαHβ1 − J_sc)/(J_HαHβ2 − J_sc) 
is equal to the gauche⁻ over trans population ratio. For 
ideal geometries the Karplus equation predicts that the 
synclinal coupling J_sc = A/4 − B/2 + C and that the average 
coupling is 5A/8 + B/4 + C. A very similar situation 
occurs with the ³J_HβHγ coupling constant. Leucine side 
chain rotational isomers with a gauche⁻ χ² torsion angle 
are virtually excluded by backbone-independent syn-pentane 
interactions [28]. When the χ² torsion angle is 
gauche⁺ the atoms Hβ1 and Hγ are antiperiplanar and 
the atoms Hβ2 and Hγ are synclinal, and when χ² is trans 
these angular magnitudes are reversed. This coupling 
is again helpful with populations but not assignments 
and the same conjugate coupling pattern is now diagnostic 
of the absence of gauche⁻ χ² rotational isomers. 
As noted in the introduction the χ¹ and χ² torsion angles 
are highly correlated. With only the trans gauche⁺ 
and gauche⁻ trans isomers populated, trans χ¹ implies 
gauche⁺ χ² and gauche⁻ χ¹ implies trans χ². This correlation 
produces a doubly conjugate ³J_HαHβ and ³J_HβHγ 
correlation pattern. When the trans gauche⁺ rotational 
isomer predominates the β1-proton couples weakly to the 


TABLE II: Experimental NMR data. The NOESY cross relaxation 
rate units are s⁻¹ and the vicinal coupling constant 
units are Hz. The standard deviations are estimated as described 
in the experimental methods section. A plus rather 
than plus and minus sign is placed between a zero value and 
standard deviation solely to indicate the fact that the measured 
quantities are theoretically nonnegative. 

Quantity        Value        Std. dev.
R_HαHβ1         0.038   ±    0.010
R_HαHβ2         0.041   ±    0.010
[illegible]     0.013   ±    0.003
R_Hβ2Hγ         0.040   ±    0.020
R_HαHδ1         0.0     +    0.005
R_HαHδ2         0.014   ±    0.003
R_Hβ1Hδ1        0.023   ±    0.010
R_Hβ1Hδ2        0.0     +    0.002
R_Hβ2Hδ1        0.0     +    0.002
R_Hβ2Hδ2        0.022   ±    0.010
³J_HαHβ1        7.381   ±    0.042
³J_HαHβ2        4.549   ±    0.034
³J_Hβ1Hγ        4.999   ±    0.034
³J_Hβ2Hγ        8.070   ±    0.059
[illegible]     3.0     ±    2.0
³J_CαHγ         5.0     ±    2.0
³J_C'Hβ1        5.3     ±    0.5
³J_C'Hβ2        4.2     ±    0.5


α-proton and strongly to the γ-proton and the β2-proton couples strongly to
the α-proton and weakly to the γ-proton. When the gauche⁻ trans isomer
predominates these coupling strengths are all reversed. If either one of
these leucine side chain rotational isomers predominates then the β-proton
assignment can be made by inspection of the ³J(C'Hβ) heteronuclear coupling
constants [4?] because the leucine carboxyl carbon and both β-protons are
synclinal when the leucine side chain χ¹ torsion is gauche⁻ and only the
carboxyl carbon and β2-proton are synclinal when χ¹ is trans.
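
The ideal geometry relations quoted above can be checked numerically. The following is a minimal sketch, assuming the Karplus form J = A cos²φ − B cos φ + C, ideal torsion angles of 60 and 180 degrees, and illustrative coefficients that are not the values of Table IV. The function names are hypothetical.

```python
import math

def karplus(phi_deg, A, B, C):
    """Karplus equation in the form J = A cos^2(phi) - B cos(phi) + C."""
    c = math.cos(math.radians(phi_deg))
    return A * c * c - B * c + C

# Illustrative coefficients in Hz (an assumption, not the values of Table IV).
A, B, C = 9.5, 1.6, 1.8

J_sc = karplus(60.0, A, B, C)    # synclinal coupling at ideal geometry
J_ap = karplus(180.0, A, B, C)   # antiperiplanar coupling at ideal geometry

def two_state_couplings(p_gm, A, B, C):
    """Halpha-Hbeta1 and Halpha-Hbeta2 couplings when only gauche- (population
    p_gm) and trans (population 1 - p_gm) chi1 isomers are populated.
    Halpha-Hbeta1 is antiperiplanar in gauche-, Halpha-Hbeta2 in trans."""
    j_sc = karplus(60.0, A, B, C)
    j_ap = karplus(180.0, A, B, C)
    j_b1 = p_gm * j_ap + (1.0 - p_gm) * j_sc
    j_b2 = p_gm * j_sc + (1.0 - p_gm) * j_ap
    return j_b1, j_b2

def gauche_minus_population(j_b1, j_b2, j_sc):
    """Invert the coupling ratio (J_b1 - J_sc)/(J_b2 - J_sc) = p(g-)/p(t)."""
    ratio = (j_b1 - j_sc) / (j_b2 - j_sc)
    return ratio / (1.0 + ratio)

j_b1, j_b2 = two_state_couplings(0.61, A, B, C)
average = 0.5 * (j_b1 + j_b2)    # independent of the population p_gm
```

In this sketch the average coupling does not change when the population passed to two_state_couplings is varied, and gauche_minus_population recovers the population from the two couplings, which is the sense in which the coupling is helpful with populations once the assignment is known.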

The experimental ³J(HαHβ) and ³J(HβHγ) vicinal coupling constants (Table II)
show the doubly conjugate pattern expected for leucine side chains, that is,
strong - weak and weak - strong. The approximately 3 Hz difference between
the weak and strong couplings indicates that both the trans gauche⁺ and
gauche⁻ trans leucine side chain isomers are significantly populated and
that the β-proton assignment is best determined by comparing the
goodness-of-fit of the two alternative assignments. For the preliminary
β-proton assignment in this section it is adequate to fit only the ³J(HαHβ)
and ³J(C'Hβ) coupling constants. The resulting two rotational isomer model,
see methods, does not have any dependence on the χ² torsion angle;
nevertheless, throughout this paragraph we maintain the assumption of highly
correlated χ¹ and χ² torsion angles and continue to refer to these two
rotational isomers as trans gauche⁺ and gauche⁻ trans. The goodness-of-fit
of the simple two rotational isomer model is 2 × 10⁻² for the β-proton
assignment in Table I and 2 × 10⁻³ for the alternative assignment. The
better fit gives population estimates of 39% trans gauche⁺ and 61% gauche⁻
trans with an uncertainty of ±9%. These population estimates fall in the
gray area between predominantly gauche⁻ trans and an approximately equal
mixture of both conformations. On either side of this gray area the
assignment made by inspection agrees with that obtained by fitting the
experimental ³J(HαHβ) and ³J(C'Hβ) coupling constants. Suppose gauche⁻ trans
predominates. Then the χ¹ torsion angle is gauche⁻, Hα and Hβ1 are
antiperiplanar, the carboxyl carbon and both β-protons are synclinal, and
the assignment in Table I is correct because the ³J(HαHβ1) coupling is
stronger than the ³J(HαHβ2) coupling and both ³J(C'Hβ) couplings are fairly
weak. On the other hand suppose the two conformations are approximately
equally mixed. Then the carboxyl carbon and the β1-proton are synclinal in
one conformation and antiperiplanar in the other, but the carboxyl carbon
and the β2-proton are synclinal in both conformations, and the assignment in
Table I is again correct because the ³J(C'Hβ1) coupling is stronger than the
³J(C'Hβ2) coupling.
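
The goodness-of-fit values used to compare the two assignments are probabilities that chi-square exceeds its fit value. A minimal sketch of that probability follows, using only the standard library. The function name is hypothetical, and the number of degrees of freedom (taken here as an integer supplied by the caller, usually the number of fitted observables minus the number of fitted parameters) is an assumption about how the fits are set up.

```python
import math

def chi_square_q(chi2, dof):
    """Probability that a chi-square variate with `dof` degrees of freedom
    exceeds `chi2` (the regularized upper incomplete gamma function)."""
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    if chi2 <= 0.0:
        return 1.0
    x = 0.5 * chi2
    if dof % 2:
        # Odd dof: start from dof = 1, where Q = erfc(sqrt(chi2 / 2)).
        q = math.erfc(math.sqrt(x))
        a = 0.5
    else:
        # Even dof: start from dof = 2, where Q = exp(-chi2 / 2).
        q = math.exp(-x)
        a = 1.0
    # Recurrence Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1).
    while 2.0 * a < dof:
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q
```

A small value of chi_square_q signals a poor fit, so comparing the values returned for the two alternative assignments is the comparison described in the text.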

The goodness-of-fit of the simple two rotational isomer model is only
2 × 10⁻² because this model predicts too high an average ³J(HαHβ) coupling
constant and too low a ³J(C'Hβ2) coupling constant. As noted above the
average of the two ³J(HαHβ) coupling constants is 5A/8 + B/4 + C for ideal
geometry. Karplus coefficients for this coupling [?, ?, 45] give average
values ranging from 8.1 to 8.7 Hz. These predicted values must be compared
with 6.0 Hz, which is the average of the two experimental ³J(HαHβ) couplings
in Table II. One explanation for this difference is that there is a small
population of rotational isomers with a gauche⁺ χ¹ torsion angle. This would
reduce the predicted average coupling constant because both β-protons are
synclinal to the α-proton when χ¹ is gauche⁺. What is important is not the
magnitude in Hertz of the difference between the average predicted and
experimental ³J(HαHβ) couplings, but its size in standard deviations, that
is, the ratio of this difference over the estimated error. The one and
one-half standard deviation difference found here is not improbably large
and reflects our estimate of the errors in the Karplus equation calibration,
see background section, and of the errors due to the assumption of ideal
geometry. In view of the known improbability of leucine gauche⁺ χ¹
rotational isomers, again see background section, these last two sources of
error are a more likely explanation of the difference.
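
The comparison in units of standard deviation can be sketched as follows. The two experimental couplings are those of Table II. The Karplus coefficients and the overall error of 1.5 Hz are assumptions chosen only for illustration, and the function name is hypothetical.

```python
def standardized_difference(predicted, observed, sigma):
    """Difference between prediction and observation in units of sigma."""
    return (predicted - observed) / sigma

# Experimental 3J(Halpha,Hbeta) couplings from Table II, in Hz.
j_b1, j_b2 = 7.381, 4.549
observed_average = 0.5 * (j_b1 + j_b2)

# Ideal geometry average 5A/8 + B/4 + C for illustrative coefficients
# (an assumption, not the values of Table IV).
A, B, C = 9.5, 1.6, 1.8
predicted_average = 5.0 * A / 8.0 + B / 4.0 + C

# Assumed overall error in Hz (Karplus calibration plus ideal geometry).
sigma = 1.5
z = standardized_difference(predicted_average, observed_average, sigma)
```

With these assumed inputs the predicted average lies at the low end of the 8.1 to 8.7 Hz range quoted above, and z comes out close to one and one-half.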

The above explanation of the difference between the average predicted and
experimental ³J(HαHβ) coupling constants is even more plausible in view of a
similar difference found between the average predicted and experimental
³J(HβHγ) coupling constants. Though the dipeptide backbone conformation
leaves some room for doubt about the complete absence of the gauche⁺ χ¹
conformation, the evidence from crystallographic studies and conformational
analysis is very good that there is at most a very small population of
rotational isomers with a gauche⁻ χ² torsion angle. Also the cobalt complex
with the dipeptide backbone should have relatively little effect on the
³J(HβHγ) coupling constant. For ideal geometry the average of the two
³J(HβHγ) coupling constants is given by the same expression as for the
³J(HαHβ) average. Karplus coefficients for the sec-butyl coupling [35] give
an average value of 8.5 Hz and coefficients corrected for substituent
electronegativity as suggested by Pachler (Ref. [46], Eq. 2 and Table 4)
give an average value of 8.1 Hz. The average of the two experimental
couplings in Table II is 6.5 Hz. Error in the Karplus coefficient
calibration and perhaps some departure from ideal geometry seem to be the
only plausible explanations of this difference. This supports our view that
overall errors of one to two Hertz are entirely possible. The two rotational
isomer model also predicts that the ³J(C'Hβ2) coupling constant is
A/4 − B/2 + C because the β2-proton is always synclinal to the leucine
carboxyl carbon when rotational isomers with a gauche⁺ χ¹ torsion angle are
excluded. The predicted coupling [?] is 1.4 Hz and the observed is 4.2 Hz
(Table II). A small gauche⁺ χ¹ population could make up much of this two
standard deviation difference, but we again favor the explanation that the
Karplus calibration is not very accurate.

The δ-proton assignments in Table I follow from the pattern of R(HβHδ)
cross relaxation rates in Table II. These assignments are also model
dependent. For several models the δ-proton assignment is unambiguous once
the β-proton assignment is selected (results not presented); however, to
keep things simple we again assume that only the trans gauche⁺ and gauche⁻
trans leucine side chain rotational isomers are significantly populated.
For both these rotational isomers the Hβ1 to Hδ1 and Hβ2 to Hδ2 distances
are 2.8 to 2.9 angstroms. The Hβ1 to Hδ2 and Hβ2 to Hδ1 distances are 2.9
and 4.0 angstroms for the trans gauche⁺ rotational isomer and reverse order
to 4.0 and 2.9 angstroms for the gauche⁻ trans isomer. The δ-proton
assignments in Table I produce the strong - weak - weak - strong pattern
observed in the experimental relaxation rates in Table II.

An unambiguous β-proton assignment is not possible if an arbitrarily large
population of gauche⁺ χ¹ rotational isomers is allowed. We have repeated the
above least squares fit to the experimental ³J(HαHβ) and ³J(C'Hβ2) coupling
constants, while allowing rotational isomers with gauche⁻, gauche⁺, and
trans χ¹ torsion angles. For the β-proton assignment in Table I the
goodness-of-fit of this three rotational isomer model, see methods, rises to
29% and that of the alternative β-proton assignment rises to 94%. By this
criterion either assignment is now acceptable. The population of rotational
isomers with gauche⁺ χ¹ torsion angles is 36% for the selected assignment
and 51% for the alternative. Either of these gauche⁺ populations seems
unacceptably high. In any event the presently available models are not able
to meaningfully predict the gauche⁺ χ¹ population. The experimental data can
be satisfactorily explained by a two rotational isomer model, which excludes
gauche⁺ χ¹ rotational isomers. The unambiguous assignment of the β and
δ-protons probably must await the preparation of cobalt dipeptide with
stereoselectively deuterated leucine side chains [47].



IV. COMPUTATIONAL RESULTS AND 
DISCUSSION 

A. Chelate ring conformation 

As discussed in the background section, conformational analysis predicts
that gauche⁺ χ¹ rotational isomers of the leucine side chain are favored in
the absence of chelate ring puckering. Crystallographic and NMR evidence
shows that the cobalt dipeptide chelate rings do pucker and that the φ2 and
ψ2 torsion angles depart from 180 degrees by as much as 10 or 20 degrees.
Simple inspection of backbone-dependent rotamer libraries suggests that this
departure is large enough to reduce the population of gauche⁺ χ¹ rotational
isomers to fairly low levels. To obtain a sharper picture of the dependence
of rotamer preferences on backbone conformation we have constructed a
rotamer library for a region-of-interest around the special point in φ × ψ
space where gauche⁺ χ¹ rotamers are most favored, that is, the point with
coordinates φ = −175 and ψ = 175 degrees. This region-of-interest rotamer
library differs from previous backbone-dependent rotamer libraries [28],
[29] only in that a limited region of φ × ψ space is divided into annular
disks around the special point instead of dividing the entire φ × ψ space
into a grid of square cells. Our region-of-interest rotamer library is
constructed from a list of backbone and side chain torsion angles of 7085
leucine residues on 445 nonhomologous (that is, with less than 50% sequence
identity) protein chains from a recent Brookhaven Protein Database of
structures with a resolution of 2.0 angstroms or better. Backbone angles in
the region-of-interest are not very common in protein structures. There are
only 0, 13, 28, 72, 112, and 251 of the leucine residues with backbone φ and
ψ angles in the 6 annular shells that have a width of 10 degrees and outer
radii of 10, 20, 30, 40, 50, and 60 degrees. There are 11 residues with a
gauche⁺ χ¹ torsion angle out of the 13 (92%) with backbone torsion angles in
the 10 to 20 degree annulus, 21 of 28 (75%) in the 20 to 30 annulus, 29 of
72 (40%) in the 30 to 40 annulus, 10 of 112 (9%) in the 40 to 50 annulus,
and 7 of 251 (3%) in the 50 to 60 annulus. The region-of-interest rotamer
library shows a dramatic drop-off of gauche⁺ χ¹ leucine side chain
rotational isomers when the backbone torsion angle is beyond the 20 to 30
degree annulus. Note that the χ² torsion angles of most of the gauche⁺ χ¹
rotational isomers in this region-of-interest are also gauche⁺.
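
The annular binning described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used to build the library: distances are taken as Euclidean distances in φ × ψ space with periodic wrap-around of each angle, each residue is assumed to arrive with a precomputed χ¹ label ("g+", "g-" or "t"), and the function names and the synthetic residues are hypothetical.

```python
import math
from collections import Counter

def wrap(delta):
    """Map an angular difference in degrees into the interval [-180, 180)."""
    return (delta + 180.0) % 360.0 - 180.0

def radius(phi, psi, center=(-175.0, 175.0)):
    """Distance in phi x psi space from the special point, in degrees."""
    return math.hypot(wrap(phi - center[0]), wrap(psi - center[1]))

def shell_statistics(residues, width=10.0, n_shells=6):
    """Count all residues and gauche+ chi1 residues in each annular shell.

    `residues` is an iterable of (phi, psi, chi1_label) tuples. Shell i
    covers radii from i * width up to, but not including, (i + 1) * width.
    """
    totals, gplus = Counter(), Counter()
    for phi, psi, label in residues:
        shell = int(radius(phi, psi) // width)
        if shell < n_shells:
            totals[shell] += 1
            if label == "g+":
                gplus[shell] += 1
    return [(totals[i], gplus[i]) for i in range(n_shells)]

# Synthetic residues, for illustration only.
residues = [
    (-175.0, 175.0, "g+"),   # at the special point
    (170.0, -170.0, "g+"),   # close to it across the periodic boundary
    (-160.0, 175.0, "t"),
    (-175.0, 130.0, "g-"),
    (-60.0, -45.0, "t"),     # helical region, outside the region-of-interest
]
stats = shell_statistics(residues)
```

Each entry of stats is a (total, gauche+) pair for one shell, so the gauche⁺ fraction of a shell is the second number divided by the first whenever the shell is not empty.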

The conformational statistics of leucines in protein database structures
show clearly that side chain rotamer preferences are highly sensitive to the
backbone conformation, especially near the φ and ψ angles of the cobalt
dipeptide backbone. Analysis of the molecular mechanics energy map over
χ¹ × χ² torsion space of the leucine




FIG. 2: Molecular mechanics energy map for rotational isomerization of the
cobalt dipeptide leucine side chain. Contour levels are dashed, 1, 3, 5, 7,
9; solid, 2, 4, 6, 8, 10 kcal/mol. Zero corresponds to −39.4 kcal/mol. The
nine rotational isomers are labeled at the position of their energy well
minima. The background shading shows the energy well boundaries and the
torsion space regions for averaging the NOESY cross relaxation and vicinal
coupling constants.



side chain (Fig. 2) and of the backbone conformation of the energy minimized
dipeptide structures suggests that the cobalt dipeptide chelate rings do
indeed pucker enough to allow gauche⁻ and trans χ¹ rotational isomers to
predominate. The gauche⁻ trans energy well has the lowest energy minimum,
which we assign the value exactly 0 kcal/mol, and the trans gauche⁺ well is
only 0.1 kcal/mol higher. The energies of the three energy well minima with
the χ¹ torsion angle gauche⁺ are 6.4 kcal/mol for gauche⁺ gauche⁻, 2.1
kcal/mol for gauche⁺ gauche⁺, and 2.6 kcal/mol for gauche⁺ trans. All other
well minima have energies higher than 2.9 kcal/mol. The molecular mechanics
energy map prediction matches the backbone-independent conformational
analysis result that the trans gauche⁺ and gauche⁻ trans leucine side chain
rotational isomers are the most stable. The backbone torsion angles of the
minimized structures at the energy minimum grid point of the gauche⁻ trans
energy well are φ = −162 and ψ = 167 degrees and of the trans gauche⁺ well
are φ = −166 and ψ = 168 degrees. These leucine backbone torsion angles
specify points in φ × ψ space that are 15 and 11 degrees from the point
where gauche⁺ χ¹ rotamers are most favored, that is, the point with
coordinates φ = −175 and ψ = 175 degrees. The backbone torsion angles at the
energy minimum grid point of the three gauche⁺ χ¹ rotational isomers are all
in the ranges −176 < φ < −174 and 178 < ψ < 180 and are all within 4 to 6
degrees of the point with both φ and ψ = 180 degrees. This seems to confirm
that the backbone torsion angles of the trans gauche⁺ and gauche⁻ trans
rotational isomers are indeed adjustments from cobalt dipeptide chelate ring
planar geometry that accommodate unfavorable backbone-dependent syn-pentane
interactions. To eliminate these backbone-dependent interactions the
backbone conformation apparently need adjust only by an angle of 10 to 15
degrees in φ × ψ torsion space, which is about half that suggested by the
region-of-interest rotamer library.
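
The distances quoted in this paragraph can be reproduced with a few lines, assuming that distance in φ × ψ torsion space means the Euclidean distance between points after each angular difference is wrapped into the interval from −180 to 180 degrees. The function names are hypothetical.

```python
import math

def wrap(delta):
    """Map an angular difference in degrees into the interval [-180, 180)."""
    return (delta + 180.0) % 360.0 - 180.0

def torsion_distance(p, q):
    """Euclidean distance in degrees between two (phi, psi) points."""
    return math.hypot(wrap(p[0] - q[0]), wrap(p[1] - q[1]))

special = (-175.0, 175.0)   # point where gauche+ chi1 rotamers are most favored
planar = (180.0, 180.0)     # planar chelate ring geometry

d_gm_t = torsion_distance((-162.0, 167.0), special)   # gauche- trans minimum
d_t_gp = torsion_distance((-166.0, 168.0), special)   # trans gauche+ minimum

# Corners of the quoted range for the three gauche+ chi1 minima.
corners = [torsion_distance((phi, psi), planar)
           for phi in (-176.0, -174.0) for psi in (178.0, 180.0)]
```

Under this assumed metric d_gm_t and d_t_gp round to 15 and 11 degrees, and the corner distances run from 4 to a little over 6 degrees.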

To definitively establish the amount of chelate ring puckering and its
influence on leucine side chain populations will require more extensive
molecular mechanics calculations and more reliable energy map error
estimates than presented here. Such molecular mechanics studies are
important because, as we have already emphasized in the experimental
section, the NMR data by itself does not give an unambiguous assignment of
the β and δ-protons. Further molecular mechanics studies are also needed to
validate our analysis of the measurability of rotational isomer populations,
which is presented in the concluding subsection of this results and
discussion section. There we analyze the measurability of the gauche⁺
gauche⁺ rotational isomer population based on the assumption that the ratio
of gauche⁺ to trans or gauche⁻ χ¹ rotational isomer populations is small.
The assignments presented in the experimental section also rely on this
assumption.

The accuracy of molecular mechanics predictions of the relative populations
of χ¹ rotational isomers depends on achieving the correct balance of at
least three energy terms: the steric energy of syn-pentane interactions, the
energy of 10 to 15 degree compensatory rotations of the leucine χ¹ and χ²
side chain dihedral angles, and the energy of ring puckering associated with
the rotation of the leucine φ and ψ backbone dihedral angles. Two general
considerations suggest that these three energy terms are correctly balanced
in the present molecular mechanics calculations. First, syn-pentane
interaction accounting correctly predicts the observed rotamer preferences
of protein side chains [28]. This implies that compensatory rotations of the
side chain dihedral angles don't significantly diminish the importance of
the syn-pentane effect in predicting rotamer preferences and that the
balance of the first two of the three above energy terms is at least
qualitatively correct in our calculations. Second, energy minimization of
pentane structures with CHARMM parameters [48], which are very similar to
those we employ, quantitatively reproduces the conformational energies
observed experimentally or predicted by ab initio calculations [28]. This
suggests that the balance of the first two of the three above energy terms
is also quantitatively correct in our calculations. It remains to establish
the accuracy of the last of the above three energy terms, the energy of ring
puckering associated with the rotation of the leucine φ and ψ backbone
dihedral angles. The molecular mechanics parameters of the cobalt chelate
ring complex are expected to play an important role in determining the ring
puckering energy. Most of the bond length and bond angle molecular mechanics
parameters are known from previous crystallographic [30] or molecular
mechanics [49] studies of cobalt complexes. At the other extreme the torsion
angle and improper torsion angle force constants with cobalt in one of the
four angle defining positions and the charges of the nitro and cobalt atoms
are not much better than order of magnitude guesses. Furthermore the −2
charge distributed over the cobalt complex may introduce a substantial
solvation effect [13, ?] into the ring puckering energy. Indeed the
solvation effects may be viewed as a fourth energy term that affects the
accuracy of molecular mechanics predictions of the relative populations of
χ¹ rotational isomers.

More extensive molecular mechanics studies are certainly needed to establish
the effect that molecular mechanics parameters and solvation have on the
relative isomer populations. However, some simple tests of the present
molecular mechanics suggest we aren't too far off. The molecular mechanics
energy map over χ¹ × χ² torsion space of the leucine side chain is not very
sensitive to the values of the uncertain parameters. The relative energy
well depths of the leucine side chain rotational isomers vary by less than
about one half kcal/mol when the torsion angle and improper torsion angle
force constants involving cobalt are scaled down to zero as a group or when
the distance independent dielectric constant equal to one is replaced by a
distance dependent dielectric constant equal to the inverse atomic
separation in angstroms.



B. Effect of intramolecular motions 

For most leucine side chain rotational isomers the effect of thermal motions
on the NOESY cross relaxation rates is several times smaller than the
typical accuracy of these measurements and the effect on vicinal coupling
constants is perhaps several times bigger than the accuracy of the best
homonuclear coupling measurements. The effect of thermal motions on side
chain vicinal coupling constants is similar in magnitude to the previously
reported effect on backbone coupling constants [3]. The magnitude of the
thermal motion effect is estimated by comparing the calculated average NMR
observables to those values calculated at the average χ¹ and χ² torsion
angles. These averages are taken over the individual energy well regions,
see methods, and thus the comparison looks at the effects of fast thermal
motions within the energy wells as opposed to the effects of slower
interconversion of rotational isomers. The trans gauche⁺ and gauche⁻ trans
rotational isomers, which are the predominantly populated isomers, are
typical. For these two rotational isomers the RMS differences between the
average observables and the observables at the average are 0.16 Hz for the
vicinal coupling constants and 0.0011 s⁻¹ for the NOESY cross relaxation
rates, where these RMS differences are averaged over these two rotational
isomers and over the 10 NOESY cross relaxation rates and 8 vicinal coupling
constants listed in Table II. These differences are substantially increased
for rotational isomers with more anharmonic χ¹ × χ² torsion space energy
wells. The difference between the energy well minimum position and the
average χ¹ and χ² torsion angles gives a rough measure of the anharmonicity
of a rotational isomer energy well. By this measure the trans gauche⁻ energy
well is the most anharmonic (compare Fig. 5) with a difference of 12 degrees
between the minimum and average positions. For the trans gauche⁺ and gauche⁻
trans energy wells these differences are only 3 and 4 degrees. For the
anharmonic energy well of the trans gauche⁻ rotational isomer the RMS
differences between the average observables and the observables at the
average are 0.31 Hz for the vicinal coupling constants and 0.0021 s⁻¹ for
the NOESY cross relaxation rates, which are both almost twice the values for
the more nearly harmonic energy wells of the trans gauche⁺ and gauche⁻ trans
rotational isomers.
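
The distinction between an average observable and the observable at the average torsion angle can be illustrated with a one-dimensional sketch. Everything specific here is an assumption made for illustration: a single torsion angle instead of the χ¹ × χ² map, a Boltzmann factor with kT = 0.593 kcal/mol, hypothetical harmonic and cubic-perturbed wells, and illustrative Karplus coefficients that are not those of Table IV.

```python
import math

KT = 0.593  # kcal/mol near 298 K (assumed temperature)

def karplus(phi_deg, A=9.5, B=1.6, C=1.8):
    """J = A cos^2(phi) - B cos(phi) + C with illustrative coefficients."""
    c = math.cos(math.radians(phi_deg))
    return A * c * c - B * c + C

def well_averages(energy, grid):
    """Boltzmann-weighted mean torsion angle and mean coupling over a well."""
    e_min = min(energy(phi) for phi in grid)
    weights = [math.exp(-(energy(phi) - e_min) / KT) for phi in grid]
    norm = sum(weights)
    mean_phi = sum(w * phi for w, phi in zip(weights, grid)) / norm
    mean_j = sum(w * karplus(phi) for w, phi in zip(weights, grid)) / norm
    return mean_phi, mean_j

def harmonic(phi, phi0=180.0, k=0.002):
    """Hypothetical harmonic torsion well, k in kcal/mol/degree^2."""
    return k * (phi - phi0) ** 2

def anharmonic(phi, phi0=180.0, k=0.002, skew=0.00001):
    """Same well with a cubic term that softens one side (hypothetical)."""
    d = phi - phi0
    return k * d * d - skew * d ** 3

grid = [120.0 + i for i in range(121)]   # 120 to 240 degrees, 1 degree steps

mean_phi_h, mean_j_h = well_averages(harmonic, grid)
mean_phi_a, mean_j_a = well_averages(anharmonic, grid)

# Effect of motion within the well: average observable minus the
# observable evaluated at the average angle.
motion_effect_h = mean_j_h - karplus(mean_phi_h)
# Effect of thermal shifting: observable at the average angle minus the
# observable at the energy minimum (180 degrees in both wells).
shift_effect_a = karplus(mean_phi_a) - karplus(180.0)
```

In the harmonic well the average angle coincides with the minimum, so only the first effect is present. The cubic term moves the average angle away from the minimum and the second effect appears as well.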

The difference between the average observables and the observables at the
average χ¹ and χ² torsion angles only accounts for about half the effect of
thermal motions on the NMR observables. Thermal shifting of the average χ¹
and χ² torsion angles of a rotational isomer also significantly changes the
NMR observables. This thermal motion effect is measured by the difference
between the NMR observables at the χ¹ × χ² energy well minimum and at the
average χ¹ and χ² torsion angles. As already noted in the last paragraph the
difference between the minimum and average positions of the trans gauche⁺
and gauche⁻ trans energy wells is 3 and 4 degrees. For these rotational
isomers the RMS differences between the observables at the minimum and
average positions are 0.17 Hz for the vicinal coupling constants and 0.0013
s⁻¹ for the NOESY cross relaxation rates. These differences are about the
same as the values of 0.16 Hz and 0.0011 s⁻¹ reported in the previous
paragraph for the differences between the average observables and the
observables at the average. In the case of the anharmonic energy well of the
trans gauche⁻ rotational isomer the thermal shifting effect is more
dramatic. The RMS differences between the observables at the minimum and
average positions are 0.75 Hz and 0.0041 s⁻¹, which are about twice the
differences between the average observables and the observables at the
average for the trans gauche⁻ rotational isomer.

All these thermal motion effects can be put into perspective by comparing
them to the differences in the NMR observables at the χ¹ × χ² energy well
minimum and at ideal geometry χ¹ and χ² torsion angles. For the trans
gauche⁺ and gauche⁻ trans rotational isomers the RMS differences between the
minimum energy and ideal geometry observables are 0.50 Hz for the vicinal
coupling constants and 0.0039 s⁻¹ for the NOESY cross relaxation rates,
which is about three times the size of the thermal motion effect. This is
just about what might be expected because the energy minima of the trans
gauche⁺ and gauche⁻ trans rotational isomers differ by 9 and 12 degrees from
the ideal geometry positions in χ¹ × χ² torsion space and these differences
are three times the differences between the average χ¹ and χ² torsion angles
and the positions of the energy well minima.



C. Necessity of molecular mechanics energy 
estimates 

At the present accuracy of Karplus equation calibrations it is not possible
to calculate the populations of all 9 rotational isomers of the leucine side
chain from only the NMR data in Table II. Before fitting the NMR data a
small set of rotational isomers with nonzero population must be selected
with molecular mechanics calculations or conformational analysis. If we
calculate the population of all 9 rotational isomers by fitting an 8
parameter model (Table III row 1), then the population estimates range from
0.0 to 0.3, though most of them are near 0.1, and the population error
estimates from the moment matrix range from about ±0.2 to ±0.3. The standard
deviations of the Monte Carlo probability density functions are all close to
±0.1. These errors are somewhat smaller than suggested by the moment matrix
because the moment matrix estimates do not take into account the
nonnegativity constraints on the isomer populations. The errors given by
either set of error estimates are larger than or at least nearly as large as
the population estimates. This indicates that fitting the 9 rotational
isomer model gives meaningless population estimates.

There are 2⁹ − 1 = 511 possible nonempty subsets of
the set of 9 rotational isomers. As we will detail shortly, 
these rotational isomer subsets generate a large number 
of distinct solutions to the problem of fitting the experi- 
mental NMR data. As an alternative to fitting all the ro- 
tational isomer populations, we might hope to find among 
this large set of solutions one best solution that includes 
only a small number of rotational isomers and that has a 
uniquely high goodness-of-fit to the NMR data. A solu- 
tion is initially generated from each subset of rotational 
isomers by fitting the experimental NMR data with the 
model that includes only the isomers in that subset. In 
general the populations of these included isomers are not 
all positive because active nonnegativity constraints force 
some populations to be exactly zero. A different set of 
experimental data would yield a different set of active 
constraints and thus a different set of positive popula- 
tions. In the next subsection we fit multiple Monte Carlo 
simulated NMR data sets to calculate population prob- 
ability distributions. But here we fit only the actual ex- 
perimental data and generate one single solution from 
each subset of rotational isomers. Two different subsets 
of isomers may generate the same solution with the same 
positive isomer populations on some common subset of 
the two original subsets of isomers. Thus the number of 
unique solutions is substantially smaller than the num- 
ber of subsets of 9 rotational isomers. A single unique 
solution is conveniently identified by the positive isomer

TABLE III: Rotational isomer populations. Each row shows the fit populations
for a model that excludes the dotted rotational isomers. All population
estimates are in percent with the error in the last digit given in
parentheses.

The goodness-of-fit Q is the percentage probability that chi-square exceeds
its fit value.

For models marked MC the population errors are calculated from a Monte Carlo
simulation of the NMR observables. The population errors of models without
Monte Carlo error analysis are derived from a least squares moment matrix.
Only the average population error is reported in the last isomer column.

d If a rotational isomer marked with an uparrow is added to the model in
this row, then a nonnegativity constraint will fix the rotational isomer
population at zero. Because these constraints are NMR observable dependent,
they are only reported for models without Monte Carlo simulated NMR
observables.

The (tg⁺, g⁻t) model is among the 5 two rotational isomer models with a
goodness-of-fit better than 1%.

The (g⁺g⁺, tg⁺, g⁻t) model is among the 15 three rotational isomer models
with a goodness-of-fit better than 10%.

g For models marked A the NMR observables are calculated by Boltzmann
weighted averaging over the energy well of each rotational isomer. The NMR
observables of all other models are calculated at the energy well minimum of
each rotational isomer.

The (tg⁺, g⁻t) model appears again with improved goodness-of-fit among these
5 two rotational isomer models with a goodness-of-fit better than 1%.

For models marked R the assignments of both β and δ-protons are reversed.

The ellipsis marks indicate that the g⁺g⁺ measurability simulations do not
generate any additional measure of goodness-of-fit because they are only
indirectly based on the experimental NMR observables.

populations and the isomers with positive populations
are referred to as the populated isomers.
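
The bookkeeping of subsets and unique solutions can be sketched as follows. The enumeration of the 511 nonempty subsets follows directly from the text. The fit itself is replaced by a stand-in: toy_fit is a hypothetical placeholder for a least squares fit with nonnegativity constraints and has no connection to the actual NMR data, and the function names and isomer labels are assumptions made for illustration.

```python
from itertools import combinations

CHI_STATES = ("g-", "g+", "t")
# Nine chi1 chi2 rotational isomer labels, for example "tg+" and "g-t".
ISOMERS = tuple(c1 + c2 for c1 in CHI_STATES for c2 in CHI_STATES)

def nonempty_subsets(items):
    """Yield every nonempty subset of `items` as a tuple."""
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)

def unique_solutions(subsets, fit, digits=3):
    """Group subsets by the solution they generate.

    `fit` maps a subset to a dict {isomer: population >= 0}. A solution is
    identified by its positive populations only, so two subsets that differ
    in isomers fixed at zero share one solution.
    """
    solutions = {}
    for subset in subsets:
        populations = fit(subset)
        key = frozenset((iso, round(p, digits))
                        for iso, p in populations.items() if p > 0.0)
        solutions.setdefault(key, []).append(subset)
    return solutions

def toy_fit(subset):
    """Stand-in for a constrained least squares fit (illustration only)."""
    favored = [iso for iso in subset if iso in ("tg+", "g-t")]
    if favored:
        share = 1.0 / len(favored)
        return {iso: (share if iso in favored else 0.0) for iso in subset}
    return {iso: (1.0 if iso == subset[0] else 0.0) for iso in subset}

subsets = list(nonempty_subsets(ISOMERS))
solutions = unique_solutions(subsets, toy_fit)
```

Keying each solution on its positive populations is what makes the number of unique solutions smaller than the number of subsets: in this toy case every subset that contains both favored isomers collapses onto a single solution.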

For the assignments in Table I, the experimental data in Table II, and with
the NMR observables calculated at the χ¹ × χ² energy map minimum positions
the 511 isomer subsets generate 278 unique solutions. One half of these
solutions have a goodness-of-fit better than 10% and two thirds of them have
a goodness-of-fit better than 1%. There are 5 solutions that have only two
populated isomers and have a goodness-of-fit between 10% and 1% and 15
solutions that have three populated isomers and have a goodness-of-fit
better than 10% (Table III rows 2 through 21). Apparently many good
solutions exist even with the restriction to solutions that have only a
small number of populated isomers. Worse, the solutions with two or three
populated isomers are inconsistent and over-fit the experimental data. As a
group, the 5 good solutions with two populated isomers give 5 predictions of
the population of each of the 9 rotational isomers. The only consistent
predictions are for the gauche⁻ gauche⁻ and trans trans rotational isomer
populations, which all 5 solutions predict are zero. The gauche⁻ trans
rotational isomer population is predicted 4 times in the range 0.5 to 0.6
and once at zero. The other 6 rotational isomer populations are each
predicted once in the range 0.4 to 0.6 and 4 times at zero. These positive
and zero population predictions are inconsistent because the positive
population errors of the 5 solutions are all around ±0.05. The 15 good
solutions with three populated isomers give a similar picture. The gauche⁻
trans rotational isomer population is predicted 11 times in the range 0.3 to
0.5 and 4 times at zero. The other 8 rotational isomer populations are each
predicted in the range of 0.3 or 0.4 from 2 to 7 times and otherwise at
zero. Again these population predictions are inconsistent because the
positive population errors of the 15 solutions are all around ±0.08. The
experimental NMR data is over-fit in the sense that the predicted isomer
populations depend on the model and the discrepancies between these
predictions are much larger than the errors estimated from the fit of a
single model.

The number of solutions with two or three populated isomers, the
goodness-of-fits, population predictions, and error estimates reported in
the last paragraph change very little if the average NMR observables are fit
instead of those at the energy map minimum positions, compare rows 2 through
6 and 22 through 26 of Table III. Inconsistent and over-fit solutions also
result if the β and δ-proton assignments in Table I are reversed. For
example, for the reverse assignments the 511 isomer subsets generate 285
unique solutions with an overall pattern of goodness-of-fits similar to the
assignments in Table I. There are 4 solutions that have only two populated
isomers and have a goodness-of-fit better than 10% (Table III rows 27
through 30). The gauche⁺ gauche⁺ and trans gauche⁺ rotational isomer
populations are each predicted 2 times in the range 0.4 to 0.6 and 2 times
at zero. Four other rotational isomer populations are each predicted



TABLE IV: Karplus coefficients. The vicinal coupling con- 
stants are given by the Karplus equation 3 J(cj>) = Acos 2 <j) — 
B cos 4> + C, where <j> is the torsion angle between the atoms 
indicated in the first column. The coefficient units are Hz. 
Throughout this work these coefficients are used to fit ro- 
tational isomer populations to experimental vicinal coupling 
constants. 
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c Breitmaier and Voelter |52i | . Wasylishen and Schaefer fRef. I53I p. 
963 and Ref.|Hp. 2711). 
''Fischman et al. (Ref. 53 Table III). 



once in the range 0.4 to 0.6 and 3 times at zero. All 4 
solutions predict that the populations of the remaining 3 
rotational isomers are zero. These population predictions 
are inconsistent because the positive population errors of 
the 4 solutions are all around ±0.05. Again there is no 
best solution and the solutions that have only a small 
number of populated isomers are over-fit. At present the 
only hope for obtaining a reasonable solution is to nar- 
row down the number of possible isomers with the help of 
molecular mechanics energies or conformational analysis. 
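
The count of 511 isomer subsets quoted above equals the number of nonempty
subsets of the 9 rotational isomers, 2^9 − 1. The short Python sketch below
enumerates them under the assumption that every nonempty subset is a
candidate model; the isomer labels and the function name are ours and are
used only for illustration.

```python
from itertools import combinations

# The three torsion states of chi1 and chi2 give nine rotational isomers.
STATES = ("gauche-", "trans", "gauche+")
ISOMERS = [f"{chi1} {chi2}" for chi1 in STATES for chi2 in STATES]

def isomer_subsets(isomers):
    """Yield every nonempty subset of the isomer list (assumed candidate models)."""
    for size in range(1, len(isomers) + 1):
        yield from combinations(isomers, size)

subsets = list(isomer_subsets(ISOMERS))
two_isomer_models = [s for s in subsets if len(s) == 2]
three_isomer_models = [s for s in subsets if len(s) == 3]
print(len(subsets), len(two_isomer_models), len(three_isomer_models))
```

Each subset would then be fit separately, which is why many subsets can
collapse onto the same solution once populations are driven to zero.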



D. Measurability of rotational isomer populations 

An analysis of all the NOESY cross relaxation rates and vicinal coupling
constants listed in Table II and the Karplus coefficients listed in
Table IV confirms the preliminary analysis in the experimental results
section that the cobalt dipeptide leucine side chain predominantly
populates the trans gauche⁺ and gauche⁻ trans rotational isomers in
approximately equal proportions. The goodness-of-fit of this simple two
rotational isomer model with the NMR observables calculated at the χ¹ × χ²
energy map minimum positions is 0.020 for the assignments given in Table I
and 2.2 × 10⁻³ if both the β and δ-proton assignments are reversed
(Table III rows 31 and 32). These goodness-of-fits increase to 0.063 and
0.010 if the average NMR observables are fit instead of those at the energy
map minimum positions (Table III rows 33 and 34). For the assignments in
Table I the gauche⁻ trans rotational isomer predominates with a population
of 0.625 ± 0.043. Switching to average NMR observables has essentially no
effect on this population or its uncertainty; both increase by only 0.001.
The reverse assignment approximately reverses the populations, but again
does not change the uncertainty.

Analysis of both the protein data bank and the molecular mechanics energy
map suggests that the third most populated rotational isomer after trans
gauche⁺ and gauche⁻ trans is gauche⁺ gauche⁺; see the section on chelate
ring conformation at the beginning of this results and discussion section.
The gauche⁺ gauche⁺ rotational isomer population is expected to be small,
perhaps less than 5 or 10%. Because the 4.3% population standard deviation
that is given by fitting the two rotational isomer model is as large as, if
not larger than, the probable population, it is unlikely that the gauche⁺
gauche⁺ population can be measured by fitting the NMR data. If a three
rotational isomer model that includes gauche⁺ gauche⁺ is fit to the NMR
data, then the gauche⁺ gauche⁺ population is 0.245 ± 0.078 with a 0.18
goodness-of-fit (Table III row 35). Though this gauche⁺ gauche⁺ population
mean is high, the population distribution is not inconsistent with the
expected small population. The distribution gives about a 5% probability
that the population is smaller than 10% and about a 1% probability that it
is smaller than 5%. Note that these probability estimates must be taken
with caution because, as pointed out in the previous section, even models
with three populated rotational isomers over-fit the experimental data.
Indeed the high population mean seems to further suggest that the gauche⁺
gauche⁺ population is poorly measured by fitting the NMR data with the
three rotational isomer model. For the reverse assignments the gauche⁺
gauche⁺ population reaches the extremely implausible level of 0.385 ± 0.078
with a 0.62 goodness-of-fit (Table III row 36). The high goodness-of-fit
for the three rotamer model with and without the assignments reversed again
suggests that the assignments in Table I must be taken with caution.

FIG. 3: Gel graphic of rotational isomer population probabilities. The
graphic shows Monte Carlo probability density functions for constrained
linear least-squares parameter estimates, where the parameters are the
rotational isomer probabilities. Each gray scale step of the stepwedge bar
corresponds to a two-fold change in probability density. The least-squares
fit has NMR measurement, Karplus equation coefficient, and molecular
mechanics geometry errors and has the gauche⁺ gauche⁺ rotational isomer
population fixed at zero. The gauche⁺ gauche⁺ probability density function
shows the unmeasurable population range.

The prominent populations of the trans gauche⁺ and gauche⁻ trans rotational
isomers are best estimated by fitting experimental data. The minuscule
populations of the other seven rotational isomers, except for gauche⁺
gauche⁺, are best estimated from the molecular mechanics energy map. This
leaves the gauche⁺ gauche⁺ rotational isomer on the awkward borderline of
experimental measurability. The standard Monte Carlo procedure [55] can be
altered to estimate the gauche⁺ gauche⁺ population and its probability
distribution (Table III row 37). Initially the gauche⁺ gauche⁺ population
is fixed at zero, the simple two rotational isomer model is fit, and Monte
Carlo NMR observables are generated, but then these Monte Carlo observables
are fit to a three rotational isomer model that includes gauche⁺ gauche⁺.
This differs from the standard procedure because one model generates the
Monte Carlo observables and a second different model is fit to these Monte
Carlo observables. The resulting Monte Carlo probability density functions
(Fig. 3) give population estimates of 0.355 ± 0.053 and 0.612 ± 0.045 for
the prominent trans gauche⁺ and gauche⁻ trans rotational isomers and
0.033 ± 0.046 for the gauche⁺ gauche⁺ rotational isomer. The prominent
rotational isomer population estimates have slightly smaller means and
larger standard deviations than the estimates given in the last paragraph
for the fit of the simple two
rotational isomer model. The Monte Carlo probability density function of
the gauche⁺ gauche⁺ rotational isomer is actually the density function of a
mixed discrete and continuous distribution [56]. The probability of having
a population of zero is 0.471, which corresponds to the fraction of
least-squares fits with an active nonnegativity constraint on the gauche⁺
gauche⁺ population. The continuous part of the probability density function
has a population mean of 0.062 and has a roughly exponential distribution.
The overall gauche⁺ gauche⁺ population mean is, as expected, near zero
because the two rotational isomer model, which generates the Monte Carlo
NMR observables, lacks the gauche⁺ gauche⁺ rotational isomer. The mean of
the continuous part of the probability distribution is greater than 5% and
suggests, in perhaps a more direct fashion than fitting the three
rotational isomer model as discussed in the last paragraph, that gauche⁺
gauche⁺ populations in the 5% range cannot be measured by fitting the NMR
data with currently available models. The extent of the continuous part of
the gauche⁺ gauche⁺ probability density function is determined by the
observation errors incorporated in the least-squares design matrix and
observation vector. Though this extent has nothing to do with the
2.1 kcal/mol relative energy of the gauche⁺ gauche⁺ rotational isomer
determined by molecular mechanics or the accuracy of this energy, it places
the gauche⁺ gauche⁺ population in a molecular mechanically realistic range
and approximately captures the uncertainty in the molecular mechanics
energy well depths. In this sense the Monte Carlo procedure described here
blends together molecular mechanics and experimental NMR data.

FIG. 4: Gel graphic of rotational isomer population probabilities with
Karplus equation coefficient and molecular mechanics geometry errors
removed. The least-squares fit has only NMR measurement errors. Note the
ten-fold expanded vertical scale, which places the prominent trans gauche⁺
and gauche⁻ trans rotational isomers off scale. All other rotational isomer
populations are fixed at zero. Their probability density functions show the
unmeasurable population ranges. The forbidden gauche⁺ gauche⁻ rotational
isomer is not included.
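
The altered Monte Carlo procedure, in which one model generates the Monte
Carlo observables and a second model is fit to them, can be illustrated with
a minimal Python sketch. Everything numerical here is hypothetical: the
error-weighted design matrix `A3`, the measurements `b_measured`, and the
unit Gaussian errors are invented for illustration, and the exhaustive
search over populated isomer sets is a simple stand-in for, not a copy of,
the active set quadratic programming used in this work.

```python
import random
from itertools import combinations

def solve(M, v):
    """Solve the small linear system M x = v by Gauss-Jordan elimination."""
    n = len(v)
    aug = [M[i][:] + [v[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_populations(A, b):
    """Least squares with p >= 0 and sum(p) == 1, trying every populated set."""
    n, best = len(A[0]), None
    for k in range(1, n + 1):
        for S in combinations(range(n), k):
            last, free = S[-1], S[:-1]
            # Remove the sum constraint: p_last = 1 - sum of the free p.
            D = [[row[j] - row[last] for j in free] for row in A]
            y = [bi - row[last] for row, bi in zip(A, b)]
            q = []
            if free:
                N = [[sum(d[i] * d[j] for d in D) for j in range(k - 1)]
                     for i in range(k - 1)]
                g = [sum(d[i] * yi for d, yi in zip(D, y)) for i in range(k - 1)]
                q = solve(N, g)
            p = [0.0] * n
            for j, qj in zip(free, q):
                p[j] = qj
            p[last] = 1.0 - sum(q)
            if min(p[j] for j in S) < 0.0:
                continue  # violates nonnegativity, so not a valid solution
            chi2 = sum((sum(a * pj for a, pj in zip(row, p)) - bi) ** 2
                       for row, bi in zip(A, b))
            if best is None or chi2 < best[0]:
                best = (chi2, p)
    return best[1]

# Hypothetical error-weighted observables; columns are trans gauche+,
# gauche- trans, and gauche+ gauche+.
A3 = [[12.0, 3.0, 3.5], [3.0, 12.5, 4.0], [2.5, 4.0, 11.0],
      [11.0, 2.0, 10.0], [4.0, 11.5, 3.0], [3.5, 3.0, 12.0]]
A2 = [row[:2] for row in A3]
b_measured = [6.4, 8.9, 3.1, 5.9, 8.2, 3.4]  # hypothetical measurements

def monte_carlo(trials=2000, seed=1):
    rng = random.Random(seed)
    p2 = fit_populations(A2, b_measured)  # first model: two isomers
    b_fit = [sum(a * p for a, p in zip(row, p2)) for row in A2]
    draws = []
    for _ in range(trials):
        b_mc = [bi + rng.gauss(0.0, 1.0) for bi in b_fit]
        draws.append(fit_populations(A3, b_mc))  # second model: three isomers
    return draws

draws = monte_carlo()
third = [p[2] for p in draws]
zero_fraction = sum(1 for x in third if x == 0.0) / len(third)
positive = [x for x in third if x > 0.0]
continuous_mean = sum(positive) / len(positive)
print(zero_fraction, continuous_mean)
```

The fraction of exactly zero populations is the discrete part of the mixed
distribution, and the mean of the positive draws characterizes its
continuous part.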

The observation errors of the least-squares fit are dominated by the errors
in the predicted NMR observables due to uncertainty in the Karplus
coefficients and molecular mechanics geometries. What would happen if the
errors in the predicted NMR observables could be reduced below the level of
the experimental measurement errors? Given that the errors in the predicted
NMR observables are about an order of magnitude larger than the
experimental measurement errors, the population error estimates should also
be reduced by about an order of magnitude into the 0.5% range. A population
of 0.5% corresponds to a relative rotational isomer energy of around
3 kcal/mol. Excepting the prominent trans gauche⁺ and gauche⁻ trans and the
forbidden gauche⁺ gauche⁻ rotational isomers, the remaining rotational
isomers have energy map minima ranging from 2.1 kcal/mol for gauche⁺
gauche⁺ to 3.9 kcal/mol for gauche⁻ gauche⁻. A 0.5% population accuracy
potentially places all rotational isomer populations except that of the
forbidden gauche⁺ gauche⁻ rotational isomer within reach of experimental
measurement. The measurability of the populations of all these rotational
isomers can be assessed by the same Monte Carlo procedure applied in the
previous paragraph to assess gauche⁺ gauche⁺ measurability (Table III
row 38). Again we fit the simple two rotational isomer model with all
observation errors included. The Monte Carlo NMR observables are generated
with only the experimental measurement errors, and an eight rotational
isomer model that excludes only the forbidden gauche⁺ gauche⁻ rotational
isomer is then fit to the Monte Carlo observables. In the initial model all
rotational isomers except the prominent trans gauche⁺ and gauche⁻ trans
rotational isomers are fixed at zero population. As a result the Monte
Carlo probability densities (Fig. 4) of these rotational isomers are the
density functions of mixed discrete and continuous distributions with zero
population probabilities ranging from 0.52 to 0.89 and population means
ranging from 0.0005 to 0.0023. The Monte Carlo density functions give
population estimates of 0.3712 ± 0.0038 and 0.6203 ± 0.0051 for the trans
gauche⁺ and gauche⁻ trans rotational isomers. The continuous parts of the
probability density functions have population means ranging from 0.0035 to
0.0067 and have roughly exponential distributions. This seems to confirm
that if experimental measurement were the only source of error, then
rotational isomer populations as small as about 0.5% could be measured.
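
The correspondence between a 0.5% population and a relative energy of
around 3 kcal/mol follows from a simple Boltzmann factor. The Python sketch
below makes that arithmetic explicit. It assumes a temperature of 298.15 K,
which is our choice and is not stated in this section, and it treats the
population as a ratio to a reference isomer at zero energy, ignoring
differences in energy well shape.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def relative_energy(population_ratio, temperature=298.15):
    """Energy in kcal/mol of an isomer with the given Boltzmann ratio."""
    return -R_KCAL * temperature * math.log(population_ratio)

def boltzmann_ratio(energy, temperature=298.15):
    """Boltzmann population ratio for a relative energy in kcal/mol."""
    return math.exp(-energy / (R_KCAL * temperature))

print(relative_energy(0.005))   # energy matching a 0.5% population ratio
print(boltzmann_ratio(2.1))     # ratio at the gauche+ gauche+ well energy
print(boltzmann_ratio(3.9))     # ratio at the gauche- gauche- well energy
```

Under these assumptions the energy map minima of 2.1 to 3.9 kcal/mol map
onto population ratios from a few percent down to roughly a tenth of a
percent.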



V. CONCLUSIONS 

This study of the cobalt glycyl-leucine dipeptide side chain rotational
isomer populations gives a realistic picture of their measurability and in
particular suggests that the population of the gauche⁺ gauche⁺ rotational
isomer is less than 5 or 10%, which is below the limit of measurability at
the present accuracy of Karplus equation calibration. Better calibrations
of the Karplus equations with model systems such as cobalt dipeptides
promise to push the limit of measurability of side chain populations down
into the 1% range. To calibrate the Karplus equations to an accuracy
substantially better than ±1 Hz probably requires a separate Karplus
equation for each vicinal coupling constant. For example, the
$^3J_{H\alpha H\beta}$ couplings of the leucine side chain would require
calibrating two Karplus equations, one for the coupling to the β1-proton
and a second for that to the β2-proton. Each Karplus equation has three
coefficients, so the calibration of the two $^3J_{H\alpha H\beta}$ Karplus
equations could require that as many as six coefficients be determined. As
many as four of these six coefficients could be determined from the
temperature dependence of the two vicinal coupling constants [3{|. However,
it is probably best to attempt to determine only three of these
coefficients by adopting the following compromise. Because the torsion
angle $\phi$ between the α and either β-proton is always synclinal or
antiperiplanar, the Karplus equations only need to be really accurate
within these angular ranges. Reasonable accuracy in these ranges can
probably be achieved with only two adjustable parameters per Karplus
equation, say by adjusting the A and C coefficients and fixing the
coefficient B at the best literature value. Note that these coefficients
are defined in the background section and that $B = (J_{ap} - J_{pp})/2$,
where $J_{pp}$ and $J_{ap}$ are the coupling constants at $\phi = 0$ and
180 degrees. A further reduction by one in the number of coefficients could
be achieved by assuming that $J_{pp}$ is the same for both β-protons. With
these compromises there are only three coefficients to be determined from
four measured quantities, that is, two coupling constants and their
temperature variations. By measuring a half-dozen or so coupling constants
about the Cα-Cβ and Cβ-Cγ bonds of the leucine side chain there would be
more than enough measured values to calibrate all the Karplus equations and
determine the unknown rotational isomer populations. Fitting the
temperature dependence of the coupling constants implies a search for not
only the rotational isomer populations, but also for a breakdown of the
free energy differences into enthalpic and entropic contributions. The
entropic contribution to the population differences can probably be
estimated more accurately from molecular mechanics [18] than by fitting the
temperature dependence of the vicinal coupling constants. Before attempting
to calibrate Karplus equations with the cobalt glycyl-leucine dipeptide the
critical issue of assigning the β and δ-protons must be addressed by
preparing samples with stereoselectively deuterated leucine side chains
[47]. The L-amino acid leucine stereoselectively deuterated at the Cδ
position is available from Cambridge Isotope Laboratories, Inc.,
50 Frontage Road, Andover MA 01810. It is also important to measure the
cobalt glycyl-leucine dipeptide side chain rotational isomer populations
with experiments that are completely independent of any Karplus equation
calibration. Triple quantum filtered nuclear Overhauser effect spectroscopy
(3QF-NOESY) or tilted rotating frame Overhauser effect spectroscopy
(3QF T-ROESY) experiments give torsion restraints from cross correlated
relaxation [57] and could supply this independent corroboration of the
population estimates.
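
The relation between the coefficient B and the coupling constants at 0 and
180 degrees can be checked numerically from the Karplus equation
$^3J(\phi) = A\cos^2\phi - B\cos\phi + C$. In the Python sketch below the
coefficient values are hypothetical placeholders chosen only to exercise the
formula; they are not the calibrated coefficients used in this work.

```python
import math

def karplus(phi_degrees, A, B, C):
    """Vicinal coupling constant in Hz from the three-term Karplus equation."""
    c = math.cos(math.radians(phi_degrees))
    return A * c * c - B * c + C

# Hypothetical coefficients in Hz, for illustration only.
A, B, C = 9.5, 1.6, 1.8

J_pp = karplus(0.0, A, B, C)     # periplanar, phi = 0 degrees
J_ap = karplus(180.0, A, B, C)   # antiperiplanar, phi = 180 degrees
J_sc = karplus(60.0, A, B, C)    # ideal synclinal geometry

# B is half the difference of the two extreme couplings, and A + C is
# their mean, so fixing B leaves two adjustable parameters.
B_recovered = (J_ap - J_pp) / 2.0
A_plus_C = (J_ap + J_pp) / 2.0
print(J_pp, J_ap, J_sc, B_recovered, A_plus_C)
```

Because only synclinal and antiperiplanar geometries occur for these
couplings, the fit mainly constrains the equation near 60 and 180 degrees.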

In addition to calibrating Karplus equations with the cobalt glycyl-leucine
dipeptide and other cobalt dipeptides we suggest parallel studies of
N-acetyl-L-leucine N'-methylamide (leucine dipeptide), which is the leucine
analogue of the alanine dipeptide [Fl], HH. The cobalt dipeptides achieve a
single backbone conformation by forming two approximately planar chelate
rings. This backbone conformation is uncommon in proteins and destabilizes
the normal side chain conformational preferences of leucine. The leucine
dipeptide should eliminate this disadvantage of the cobalt glycyl-leucine
dipeptide without sacrificing the significant advantage of a single
backbone conformation. Indeed the alanine dipeptide C7eq backbone
conformation predominates in weakly polar solvents [59]. This strongly
suggests that the leucine dipeptide backbone would adopt the C7eq
conformation in solvents such as acetonitrile or chloroform, which would in
turn strongly favor the gauche⁻ trans conformation of the leucine side
chain. Parallel studies of cobalt dipeptides and alanine dipeptide
analogues would also reveal the extent to which cobalt chelation influences
the vicinal coupling constants about the Cα-Cβ bond.

A molecular graphic with multiple superimposed structures is perhaps the
most common format for reporting the presence of multiple conformations in
an NMR or crystallographic protein structure. It is very tempting to
suppose that the structures displayed in these molecular graphics are
correct in every detail and that structures or side chain conformations not
present in these molecular graphics are extremely unlikely. Gel graphics
such as those presented here for the leucine side chain of the cobalt
dipeptide could supplement common molecular graphics and give a more
realistic picture of protein conformational distributions.



VI. METHODS 

We adopt the convention that a side chain torsion angle is gauche⁻ in the
range −120 to 0 degrees, gauche⁺ in the range 0 to 120 degrees, and trans
in the range −180 to −120 or 120 to 180 degrees. Torsion angle definitions
and atom names are specified by the IUPAC-IUB conventions and nomenclature
[60]. The β and δ-protons are also identified according to the
Cahn-Ingold-Prelog nomenclature scheme [61] for substituents on the Cβ and
Cγ prochiral centers. The torsion angle between two atoms across a single
bond is synclinal in the range −90 to −30 or 30 to 90 degrees and
antiperiplanar in the range 150 to 180 or −180 to −150 degrees [62]. These
last terms are very convenient for describing the geometry of vicinally
coupled spin systems.
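
These angular conventions translate directly into a small classifier. The
Python sketch below is one way to encode them; the function names are ours,
and the treatment of angles that fall exactly on a range boundary (0 and
±120 degrees) is an assumption, since the conventions above do not say
which range owns the boundary.

```python
def wrap(angle):
    """Map an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def rotamer_state(chi):
    """Classify a side chain torsion angle as gauche-, gauche+, or trans."""
    chi = wrap(chi)
    if -120.0 <= chi < 0.0:
        return "gauche-"
    if 0.0 <= chi < 120.0:
        return "gauche+"
    return "trans"

def vicinal_geometry(phi):
    """Classify the torsion angle between two vicinally coupled spins."""
    magnitude = abs(wrap(phi))
    if 30.0 <= magnitude <= 90.0:
        return "synclinal"
    if magnitude >= 150.0:
        return "antiperiplanar"
    return "other"

def isomer_name(chi1, chi2):
    """Name a rotational isomer by its chi1 state followed by its chi2 state."""
    return f"{rotamer_state(chi1)} {rotamer_state(chi2)}"

print(isomer_name(-65.0, 175.0))   # gauche- trans
print(vicinal_geometry(-172.0))    # antiperiplanar
```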



A. NMR experiments 

Barium[glycyl-L-leucinatonitrocobalt(III)] was prepared as previously
described [Hj]. NMR spectra were recorded at 500 MHz on a Bruker AMX-500
spectrometer. Pure absorption 2D NOESY spectra were obtained by
time-proportional phase incrementation [63] at mixing times of 50, 100,
200, 400, and 800 ms, and the cross relaxation rates were determined by one
parameter linear least-squares fits of the initial build-up rates of the
peak volumes [64]. The standard deviations of the cross relaxation rates
were estimated from the one element least-squares moment matrices. Cross
relaxation rates within one standard deviation of zero were set equal to
zero. Both homonuclear and heteronuclear vicinal coupling constants were
derived from 1D NMR spectra. The homonuclear couplings were analyzed [65],
[66] with the LAOCN-5 program (QCPE #458).
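
A one parameter least-squares fit of an initial build-up rate can be
sketched in a few lines of Python. The sketch assumes a line through the
origin, equal weights for all peak volumes, and a variance estimated from
the fit residuals; the normalized peak volumes and the choice of which
mixing times lie in the initial linear regime are hypothetical.

```python
import math

def fit_cross_relaxation_rate(mixing_times, volumes):
    """Fit volume = rate * mixing_time and return (rate, standard deviation)."""
    stt = sum(t * t for t in mixing_times)
    rate = sum(t * v for t, v in zip(mixing_times, volumes)) / stt
    residuals = [v - rate * t for t, v in zip(mixing_times, volumes)]
    # One-element moment matrix: residual variance divided by sum of t**2.
    variance = sum(r * r for r in residuals) / (len(volumes) - 1) / stt
    sd = math.sqrt(variance)
    # A rate within one standard deviation of zero is set equal to zero.
    if abs(rate) <= sd:
        rate = 0.0
    return rate, sd

# Hypothetical normalized cross peak volumes at short mixing times (seconds).
times = [0.05, 0.10, 0.20]
strong_peak = [0.021, 0.039, 0.081]
noise_peak = [0.001, -0.002, 0.001]

print(fit_cross_relaxation_rate(times, strong_peak))
print(fit_cross_relaxation_rate(times, noise_peak))
```

The second peak illustrates the thresholding rule: its fitted rate is
smaller than its standard deviation, so it is reported as zero.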



B. Simple models for preliminary β-proton assignment

The goodness-of-fits of the two alternative assignments were compared for
two simple models for the experimental coupling constants across the Cα-Cβ
bond. The two rotational isomer model included only the gauche⁻ and trans
χ¹ rotational isomers and the three rotational isomer model included all
three χ¹ rotational isomers. For both models the experimental data were
linear least-squares fit with the population sum constrained to one so that
the first model had one population parameter and the second two parameters.
The torsion angles between the coupled spins were assumed to have ideal
synclinal or antiperiplanar values of magnitude 60 or 180 degrees. The
Karplus coefficients for the $^3J_{H\alpha H\beta}$ and $^3J_{CH\beta}$
coupling constants (Table IV) were specifically calibrated [42], [44] for
these couplings in peptides without any correction for cobalt chelation
effects. The predicted coupling constants were assumed to have errors of
1.5 Hz to accommodate uncertainty in both the Karplus coefficients and
geometry.



C. Molecular mechanics 

Energy maps on χ¹ × χ² torsion space and cobalt glycyl-leucine structure
coordinates were calculated with the CHARMM molecular mechanics program
[67]. The CHARMM structure file internal data structure was generated from
custom made topology and parameter CHARMM input files. The custom topology
file contained nonstandard glycine and leucine residues to generate the
peptide portion of the cobalt dipeptide complex and a patch to add the
cobalt and three nitro groups. The molecular mechanics atomic charges were
assigned by a simple scheme. First the side chain and backbone atomic
charges were set to those of a glycyl-leucine zwitterion. The net dipeptide
charge was then −0.6 because one N-terminal amine proton and the backbone
amide proton were removed to make cobalt bonds. On the assumption that the
nitro group charge magnitudes should be somewhat less than unity and the
cobalt charge should be slightly positive we then assigned a charge of −0.6
to each nitro group and 0.4 to cobalt to give the correct total charge of
−2.0 to the complete cobalt dipeptide anion. Electrostatic interactions
were computed at a dielectric constant of 1.0 and without any distance
cutoff, except where otherwise noted. Specialized force field parameters
for the cobalt dipeptide ring system were introduced by giving nonstandard
atom type codes to all the peptide backbone heavy atoms except for the
leucine α-carbon atom. Bond lengths and bond angles were taken from a
glycyl-glycine cobalt crystallographic structure [3(J. Bond length and bond
angle force constants were taken from the force field of a polyamine cobalt
complex [49]. The force constants of torsion angles with cobalt in one of
the four angle defining positions were guessed by adopting the torsion
angle force constants of standard peptide backbone atoms with roughly
matching orbital hybridization and bonding. A glycyl-glycine cobalt
crystallographic structure [68] gave the initial Cartesian coordinates of
the cobalt dipeptide ring system. Other ligand heavy atoms of this same
structure gave initial coordinates for the three nitro nitrogens. The nitro
oxygens were then built so that the plane of each nitro group was in a
staggered orientation with respect to the other cobalt ligand bonds as
viewed down each nitro cobalt bond. An internal coordinate representation
of the leucine side chain was set up during generation of the CHARMM
structure file. The torsion angle internal coordinates were set to ideal
values defined in the topology file and the bond lengths and angles were
filled in from the parameter file. The molecular mechanics energy map over
χ¹ and χ² torsion space was computed by editing these two torsion angle
internal coordinates, building the side chain Cartesian coordinates from
internal coordinates, restraining the χ¹ and χ² torsion angles with an
energy constant of 400 kcal/(mol rad²), and energy minimizing by the
steepest descent method for 20 steps followed by the adopted basis
Newton-Raphson (6?| method for 200 steps. This sequence of edit, build,
restrain, and minimize was repeated for all 5184 points on a 5 degree grid
in χ¹ and χ² torsion space. The map was output from CHARMM as a list with
each line containing the χ¹ and χ² coordinates and energy at one grid
point. The Cartesian coordinates of the energy minimized dipeptide were
temporarily output as a trajectory file with one trajectory file coordinate
set for each grid point. Interatomic distances and torsion angles required
for modeling cross relaxation rates and vicinal coupling constants were
extracted from this temporary trajectory file with the CHARMM correlation
and time series analysis command. To account for rotational averaging of
the cross relaxation rate to a methyl group, the three interatomic
distances from the methyl protons to the other cross relaxing atom were
inverse sixth root mean inverse sixth power averaged with CHARMM time
series manipulation commands. Each interatomic distance, averaged
interatomic distance, and vicinal spin torsion angle was output from CHARMM
as a separate file with one distance or angle on each line and one line for
each grid point.
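
Two numerical details of this procedure are easy to make concrete: the size
of the 5 degree torsion grid and the inverse sixth power averaging of the
methyl proton distances. The Python sketch below illustrates both. It is
not CHARMM input, the grid origin at −180 degrees is an assumption, and the
three distances are hypothetical.

```python
def torsion_grid(step=5):
    """Return all (chi1, chi2) grid points in degrees, covering -180 to 175."""
    values = range(-180, 180, step)
    return [(chi1, chi2) for chi1 in values for chi2 in values]

def methyl_average_distance(distances):
    """Inverse sixth root of the mean inverse sixth power of the distances."""
    mean_inverse_sixth = sum(r ** -6 for r in distances) / len(distances)
    return mean_inverse_sixth ** (-1.0 / 6.0)

grid = torsion_grid()
print(len(grid))  # 72 * 72 grid points

# Hypothetical distances in angstroms from three methyl protons to one proton.
effective = methyl_average_distance([2.5, 3.0, 3.5])
print(effective)
```

The averaged distance is weighted toward the shortest of the three
distances, as expected for a relaxation rate that scales with the inverse
sixth power of distance.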

Compared to the CHARMM 22 developmental topology and parameter files for
proteins with all hydrogens our custom topology file has similar backbone
charges and side chain charges of about one half the magnitude. Our peptide
parameters are generally similar to those in the CHARMM 22 developmental
parameter file, except that we only define a single tetrahedral carbon atom
type that does not depend on the number of bonded hydrogen atoms and our
bond angle potential does not have Urey-Bradley f6i| interactions.

A separate FORTRAN program [lj| partitioned the molecular mechanics χ¹ × χ²
energy map into energy well regions. The program employed a cellular
automaton that adjusted the regions so that the boundaries passed through
the energy map saddle points and followed along the ridges leading up to
the tops of high energy peaks. The program named each well, assigned an
index to each well, arranged the indices in a conventional order, and
output a new energy map such that each output line specified the energy and
well index of one grid point. This cellular automaton program is available
from the first author of this paper.



D. NMR observables and Monte Carlo simulations 

We calculated NMR observables (both NOESY cross relaxation rates and
vicinal coupling constants), fit rotational isomer probabilities, ran Monte
Carlo simulations of probability distributions, and generated graphics with
the MATLAB software package. To accomplish these tasks we carefully
designed and wrote a library of 36 function files containing about 1600
lines of MATLAB script. These functions passed all variables explicitly
through input and output argument lists and made no references to global
variables except in one minor instance of a function passed through an
input argument list. Important information, such as spin assignments,
isomer names, NMR measurement names, Karplus coefficient selections, and
χ¹ × χ² torsion space grid point coordinates, was passed explicitly from
low level definition routines back to high level I/O routines to minimize
the possibility of mixing up array index definitions.

The NMR observables were calculated by a function file that input a list of
NOESY cross relaxing protons and vicinally coupled spins, opened
appropriate CHARMM distance or angle data files, which are described in the
molecular mechanics methods subsection, and output a matrix of cross
relaxation rates and vicinal coupling constants, where the matrix had one
column for each NMR observable and one row for each distance or angle
listed in the CHARMM files. To calculate a cross relaxation rate the
molecular mechanics interproton distance read from the CHARMM data file was
raised to the inverse sixth power and multiplied by the average of glycine
geminal α-proton and leucine geminal β-proton scale factors, where each
scale factor was equal to the experimental geminal proton cross relaxation
rate times the sixth power of the average molecular mechanics distance
between the geminal protons. The glycine and leucine geminal proton
relaxation rates were 0.39 ± 0.01 s⁻¹ and 0.42 ± 0.01 s⁻¹, respectively.
The geminal proton scale factor varied from 13.5 to 13.7 Å⁶ s⁻¹ depending
on whether the geminal proton distances were averaged over structures at
all energy map grid points or just over the structures of the nine energy
minimized rotational isomers. To calculate a vicinal coupling constant the
function file selected the Karplus equation coefficients based on the names
in the input list of vicinally coupled spins and inserted the molecular
mechanics vicinal proton torsion angle read from the CHARMM data file into
a Karplus equation with the selected coefficients. The Karplus coefficients
for the $^3J_{H\alpha H\beta}$ and $^3J_{CH\beta}$ coupling constants were
specifically calibrated [42], [44] for these couplings in peptides without
any correction for cobalt chelation effects. The coefficients for the
$^3J_{H\beta H\gamma}$ coupling constant were those suggested [111 H(| for
the sec-butyl fragment without correction for the extra carbon substitution
on the γ-carbon. The coefficients for the Hα-Cα-Cβ-Cγ and Cα-Cβ-Cγ-Hγ
heteronuclear coupling constants were taken from a fit to theoretical
coupling constants calculated for propane [52], [53], [HI. These Karplus
coefficients are summarized in Table IV.
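
The geminal proton calibration of the cross relaxation rates amounts to two
short formulas, sketched below in Python. The geminal relaxation rates are
the measured values quoted above, but the geminal proton distance of
1.795 Å is a hypothetical stand-in for the molecular mechanics average,
chosen only for illustration, and the function names are ours.

```python
def geminal_scale_factor(geminal_rate, geminal_distance):
    """Scale factor in A^6 s^-1 from a geminal rate (s^-1) and distance (A)."""
    return geminal_rate * geminal_distance ** 6

def cross_relaxation_rate(distance, scale_factor):
    """Predicted cross relaxation rate in s^-1 at an interproton distance in A."""
    return scale_factor * distance ** -6

GEMINAL_DISTANCE = 1.795  # hypothetical average geminal H-H distance in A

# Average of the glycine alpha-proton and leucine beta-proton scale factors.
scale = 0.5 * (geminal_scale_factor(0.39, GEMINAL_DISTANCE)
               + geminal_scale_factor(0.42, GEMINAL_DISTANCE))

print(scale)
print(cross_relaxation_rate(2.5, scale))  # rate for a hypothetical 2.5 A pair
```

With this calibration a predicted rate at the geminal distance reproduces
the mean of the two measured geminal rates by construction.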

The molecular mechanics energy, interproton distances, vicinal proton
torsion angles, and NMR observables were all computed on a 5 degree χ¹ × χ²
torsion space grid. The average χ¹ and χ² torsion angles and average NMR
observables of each rotational isomer were computed by Boltzmann weighted
summation over each energy well region in χ¹ × χ² torsion space. To average
the χ¹ and χ² torsion angles over an energy well region these angles were
referenced to the minimum energy grid point so that they varied
continuously in the range −180 to 180 degrees in this energy well region.
We assessed the effect of thermal motions by comparing the average NMR
observables of each rotational isomer with the observables at the average
χ¹ × χ² torsion angles, at the χ¹ × χ² energy map minimum position, and at
the ideal geometry χ¹ and χ² torsion angles. The NMR observables at the
average torsion angles were computed by interpolating the observables
between χ¹ × χ² torsion space grid points with a bicubic spline. The energy
map minima positions were approximated by the minima of the interpolating
function. To assess the accuracy of this approximation we repeated the
molecular mechanics energy minimization at the minimum energy grid point in
each of the nine energy wells, except that during the last 100 steps of
adopted basis Newton-Raphson minimization the χ¹ and χ² torsion angle
restraints were released. The χ¹ and χ² torsion angles of these minimized
unrestrained structures differed from the interpolated energy map minimum
positions by less than about one half degree for all the energy wells
except for the highly anharmonic trans gauche⁻ and trans trans energy
wells, where the torsion angles differed from the interpolated positions by
about one and one half degrees. Given these small differences in χ¹ and χ²
torsion angles we assumed that the differences between the distances,
angles, and NMR observables interpolated to the energy map minima positions
and those calculated from the minimized unrestrained structures would also
be small. We arbitrarily decided to calculate the NMR observables at the
χ¹ × χ² energy map minimum positions from the minimized unrestrained
structures rather than calculate them by interpolating the observables
between χ¹ × χ² torsion space grid points.
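
The Boltzmann weighted averaging over an energy well, including the
referencing of torsion angles to the well minimum so that they vary
continuously across ±180 degrees, can be sketched as follows. For brevity
this Python sketch treats a single torsion angle rather than the full
χ¹ × χ² grid; the value of RT (0.593 kcal/mol, roughly 298 K), the energies,
and the observables are hypothetical.

```python
import math

RT = 0.593  # kcal/mol, assumed thermal energy near 298 K

def wrap(delta):
    """Map an angle difference in degrees into the interval [-180, 180)."""
    return (delta + 180.0) % 360.0 - 180.0

def well_average(points, chi_min, rt=RT):
    """Boltzmann average over one well.

    points is a list of (chi_degrees, energy, observable) grid entries and
    chi_min is the torsion angle of the minimum energy grid point.
    """
    e_min = min(energy for _, energy, _ in points)
    weights = [math.exp(-(energy - e_min) / rt) for _, energy, _ in points]
    total = sum(weights)
    # Average the deviation from the minimum, not the raw angle, so that
    # points on either side of +/-180 degrees do not cancel.
    shift = sum(w * wrap(chi - chi_min)
                for w, (chi, _, _) in zip(weights, points)) / total
    observable = sum(w * obs for w, (_, _, obs) in zip(weights, points)) / total
    return wrap(chi_min + shift), observable

# Hypothetical trans well that straddles the +/-180 degree boundary.
trans_well = [(175.0, 0.2, 10.0), (-180.0, 0.0, 12.0), (-175.0, 0.2, 10.0)]
chi_avg, obs_avg = well_average(trans_well, chi_min=-180.0)
print(chi_avg, obs_avg)
```

A naive mean of the raw angles 175, −180, and −175 would give −60 degrees,
which is why the angles are first referenced to the well minimum.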

To find the rotational isomer probabilities we mini- 
mized the difference between the experimentally mea- 
sured and predicted NOESY cross relaxation rates and 
vicinal coupling constants subject to the constraints that 
the probabilities were nonnegative and that their sum 
was one. The design matrix was formed by calculating 
the NMR observables for each rotational isomer as de- 
scribed in the last two paragraphs, arranging these ob- 
servables in a matrix with one row for each observable 
and one column for each rotational isomer, and dividing 
each matrix element in each row by the observation er- 
ror for that row. The observation vector was formed by 
dividing element-wise the column vector of experimental 
measurements by the column vector of observation er- 
rors. The observation errors were the RMS average of 
the experimental measurement errors (Table |TTJ) and er- 
rors in the predicted NMR observables due to uncertainty 
in the Karplus coefficients and molecular mechanics ge- 
ometries. The predicted NOESY cross relaxation rates 
were assumed to have uncorrelated errors of 0.01 s$^{-1}$ and
the predicted vicinal coupling constants were assumed
to have uncorrelated errors of 1.0 Hz. The linear least-squares
problem with linear constraints was converted to
the equivalent [70] quadratic programming problem and
solved by an active set strategy [71].
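
The weighting and the constraints described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function names `project_to_simplex` and `fit_populations` are hypothetical, and a projected gradient iteration is swapped in for the active set quadratic programming strategy of the text, since it enforces the same nonnegativity and unit-sum constraints with very little code.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_i >= 0, sum(p) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0.0:
            theta = t  # keep the shift from the last k that passes the test
    return [max(x - theta, 0.0) for x in v]


def fit_populations(observables, measured, errors, steps=5000):
    """Fit rotational isomer probabilities by constrained least squares.

    observables[i][j] is the predicted value of observable i for
    rotational isomer j; measured[i] and errors[i] are the experimental
    value and the observation error of observable i.
    """
    n_obs, n_rot = len(observables), len(observables[0])
    # Design matrix and observation vector: each row divided by its error.
    A = [[observables[i][j] / errors[i] for j in range(n_rot)]
         for i in range(n_obs)]
    b = [measured[i] / errors[i] for i in range(n_obs)]
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so its reciprocal is a safe gradient step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    p = [1.0 / n_rot] * n_rot  # start from equal populations
    for _ in range(steps):
        residual = [sum(A[i][j] * p[j] for j in range(n_rot)) - b[i]
                    for i in range(n_obs)]
        gradient = [sum(A[i][j] * residual[i] for i in range(n_obs))
                    for j in range(n_rot)]
        p = project_to_simplex([p[j] - step * gradient[j]
                                for j in range(n_rot)])
    return p
```

For a well-conditioned design matrix both approaches should reach the same minimum, because the objective is convex; an active set method has the advantage of identifying explicitly which nonnegativity constraints are active.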

The accuracies of the fit rotational isomer probabilities
were estimated from Monte Carlo probability density
functions and from the diagonal elements of a moment
matrix. We set up an unconstrained linear least-squares
subproblem with the number of rotational isomer
probability parameters equal to one less than the
number of inactive nonnegative probability constraints.
The desired moment matrix was obtained by transformation
[72] of the subproblem moment matrix to generate
matrix elements for the probability parameter that was
removed to enforce the probability sum constraint. The
probability density functions of the fit rotational isomer
probabilities (parameters) were computed by the standard
Monte Carlo recipe [55]: the experimental NMR
observables were fit to yield fit parameters and fit NMR
observables; the fit parameters and fit NMR observables
were assumed to be the true parameters and the error-free
experimental NMR observables; random errors were
added to the fit NMR observables to give simulated NMR
observables; these simulated NMR observables were fit to
give simulated fit parameters; the previous two steps were
repeated many times; and the resulting large set of simulated
fit parameters was histogrammed to form the Monte
Carlo probability density functions of the fit parameters
(rotational isomer probabilities). To keep the simulated
NMR observables nonnegative the random errors were
drawn from appropriately truncated Gaussian distributions.
These distributions were generated by a simple
acceptance-rejection method, that is, if one sample drawn
from a standard Gaussian distribution would have given
a negative value to a particular NMR observable then
that sample was discarded and a new sample was drawn
and tested in the same way. When the standard deviations
of the Monte Carlo probability density functions
were significantly smaller than the fit rotational isomer
probabilities, that is, when none of the fit rotational isomer
probabilities were near zero, the Monte Carlo standard
deviations were almost exactly equal to the moment
matrix standard deviations.
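
The acceptance-rejection step and the Monte Carlo loop can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' M-files: the names `simulate_observables`, `monte_carlo`, and `histogram_populations` are hypothetical, and `refit` stands for whatever constrained fitting routine maps a vector of simulated observables to fit parameters.

```python
import random


def simulate_observables(fit_observables, errors, rng):
    """Add truncated Gaussian errors to the fit NMR observables.

    A standard Gaussian sample that would make an observable negative is
    discarded and redrawn. The fit observables are assumed nonnegative,
    so each draw is accepted with probability of at least one half.
    """
    simulated = []
    for value, sigma in zip(fit_observables, errors):
        while True:
            candidate = value + sigma * rng.gauss(0.0, 1.0)
            if candidate >= 0.0:
                simulated.append(candidate)
                break
    return simulated


def monte_carlo(fit_observables, errors, refit, n_steps, seed=0):
    """Repeat the simulate-and-refit steps; return all simulated fits."""
    rng = random.Random(seed)
    return [refit(simulate_observables(fit_observables, errors, rng))
            for _ in range(n_steps)]


def histogram_populations(populations, n_bins=1024):
    """Split one isomer's simulated populations into the discrete
    fraction exactly at zero and a histogram of the nonzero values."""
    n = len(populations)
    zero_fraction = sum(1 for p in populations if p == 0.0) / n
    counts = [0] * n_bins
    for p in populations:
        if p > 0.0:
            counts[min(int(p * n_bins), n_bins - 1)] += 1
    return zero_fraction, [c / n for c in counts]
```

Counting the samples that land exactly on zero separately reflects the fact that the nonnegativity constraint places a finite fraction of the simulated fits at zero population.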



E. Gel graphics 

Monte Carlo probability density functions were displayed
as gel graphics, which were designed to visually indicate
both the discrete probability fraction at zero population
and the shape of the continuous probability density
over the range of population from zero to one. This was
accomplished by a simulated photographic process where 
the degree of film overexposure indicates the probability 
fraction at zero population and continuous gray tones 
represent the continuous part of the probability density. 
The continuous part of the probability density was pre- 
filtered to reduce the noise from the Monte Carlo sam- 
pling. The prefilter consisted of two cycles of alternating 
extrapolation and Gaussian smoothing. The first cycle 
estimated the probability density at zero population by 
evenly extrapolating the probability density about zero 
population and then smoothing. In the second cycle the 
original probability density was oddly extrapolated about 
the just determined probability density at zero popula- 
tion and then smoothed. The standard deviation of the 
smoothing in the first cycle was half that in the second cy- 
cle. We speak of this second cycle standard deviation as 
the prefilter standard deviation. This somewhat cumber- 
some prefilter procedure smoothed the continuous part 
of the probability density without introducing distortion 
near zero population, where the probability density typi- 
cally has a positive value and a nonzero slope. Note that 
this positive value is distinct from the discrete probabil- 
ity fraction at zero population, which is not prefiltered. 
The prefiltering eliminated noise from the Monte Carlo
sampling, which would otherwise show up as distract- 
ing transverse stripes across the gel lanes. We exam- 
ined the convergence of the probability density functions 
by repeating Monte Carlo simulations with increasing 
numbers of steps and comparing conventional probabil- 
ity density plots and gel graphics. The prefilter standard 
deviation was adjusted to remove Monte Carlo sampling 
noise without visibly obscuring the shape of the probability
density. For simulations of length $10^3$, $10^4$, and
$10^5$ steps the appropriate standard deviation expressed
as the full width at half maximum (FWHM) was 32, 16, and
8 histogram bins, where 1024 bins covered the population
range from zero to one. At $10^3$ steps substantial
fluctuations in the shape of the conventional plots were
observed, but the gel graphics had at least qualitatively
converged to their final appearance. At $10^5$ steps only
tiny differences were observed in the conventional plots
and the differences in the gel graphics were imperceptible.
The gel graphics presented here were created from
Monte Carlo simulations of $10^5$ steps, even though simulations
as short as $10^3$ steps would be adequate for many
purposes.
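
The two-cycle prefilter can be sketched in Python as below. Several details are assumptions made for illustration and are not stated in the text: the histogram bins are treated as half-sample symmetric about zero population, the density at zero population is taken as the first smoothed bin of the first cycle, the edge at population one is always evenly extrapolated, and the Gaussian kernel is truncated at four standard deviations. The function names are hypothetical.

```python
import math


def gaussian_kernel(fwhm_bins):
    """Normalized Gaussian weights for a width given as FWHM in bins."""
    sigma = fwhm_bins / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = int(math.ceil(4.0 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights], half


def smooth(density, fwhm_bins, left_value=None):
    """Gaussian smoothing with extrapolation below zero population.

    left_value=None: even extrapolation, p(-x) = p(x).
    left_value=p0:   odd extrapolation about p0, p(-x) = 2*p0 - p(x).
    """
    kernel, half = gaussian_kernel(fwhm_bins)
    n = len(density)

    def sample(i):
        if i < 0:
            mirrored = density[min(-1 - i, n - 1)]
            if left_value is None:
                return mirrored
            return 2.0 * left_value - mirrored
        if i >= n:
            return density[max(2 * n - 1 - i, 0)]  # even, assumed
        return density[i]

    return [sum(w * sample(i + k - half) for k, w in enumerate(kernel))
            for i in range(n)]


def prefilter(density, fwhm_bins):
    """Two cycles of alternating extrapolation and Gaussian smoothing."""
    # Cycle 1: even extrapolation and half-width smoothing, used only to
    # estimate the probability density at zero population.
    p_zero = smooth(density, fwhm_bins / 2.0)[0]
    # Cycle 2: odd extrapolation of the original density about p_zero,
    # smoothed with the full prefilter width.
    return smooth(density, fwhm_bins, left_value=p_zero)
```

With these conventions a call such as `prefilter(histogram, 8.0)` would correspond to the 8 bin FWHM quoted for $10^5$ steps. The odd extrapolation continues a sloping density through its value at zero population, which is why the second cycle does not flatten a nonzero slope at the edge.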

After prefiltering, an initial image was formed with the
probability density functions of the rotational isomers 
displayed in lanes 1024 rows long by 64 columns wide. 
In this initial image the values across each row were con- 
stant and equal to the probability of finding the popula- 
tion in bins of width 1/1024 covering the range zero to 
one, except that the values across the bottom row of each
lane contained the discrete probability fraction at zero 
population. The probability of finding the population in 
a given bin is proportional to the bin width and inversely 
proportional to the total number of bins, but the prob- 
ability fraction at zero population is independent of the 
bin width. Because of the relatively high resolution of 
the image the probability fraction at zero population is 
about an order of magnitude larger than the probability 
of finding the population in another bin. For this rea- 
son the probability fraction at zero population can be 
effectively displayed as a film overexposure. 
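
One reading of this lane layout is sketched below in Python. The function name `build_lane` is hypothetical, and two details are assumptions: population increases from the bottom of the lane to the top, and the bottom row is overwritten by the discrete probability fraction rather than added as an extra row.

```python
def build_lane(bin_probabilities, zero_fraction, width=64):
    """Return one gel lane as a list of rows, top row first.

    bin_probabilities[k] is the probability of finding the population in
    bin k, ordered from zero population upward (1024 bins in the text).
    """
    # Reverse the bins so that zero population sits at the bottom.
    rows = [[p] * width for p in reversed(bin_probabilities)]
    # The bottom row carries the discrete probability fraction at zero.
    rows[-1] = [zero_fraction] * width
    return rows
```

Because the bin probabilities shrink as the number of bins grows while the discrete fraction does not, the bottom row stands out more strongly at higher image resolution.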

To simulate film overexposure at zero population and
smooth the lane edges along the continuous part of the
probability distribution, a Gaussian blur filter with a
FWHM of 16 pixels was applied to the initial image.
With this amount of blurring the typical probability density
at zero population was still considerably greater than
that along the continuous part of the probability distribution.
The pixel values of the blurred image were
treated like scene luminances [73] and converted into photographic
print densities. First, the maximum printable
luminance $L_m$ was set equal to the maximum probability
in the continuous part of the probability distributions,
that is, excluding the probability fractions at zero population.
The logarithm of the luminance $L$ was converted
to density with a characteristic curve [74] given by
$(3x^2 - 2x^3)\,10\log 2$, where $x = 1 + (\log L - \log L_m)/(10\log 2)$ and
$0 < x < 1$. Note that the maximum point-gamma of our
characteristic curve is 1.5. Then the printable densities
were linearly mapped into gray scale values. A step wedge
bar of the 11 zones in the Zone System [75] was added
to the gel graphic as an aid to calibrating the probability
densities.
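
The characteristic curve can be checked numerically with a short Python sketch. The function name `print_density` is hypothetical, base-10 logarithms are assumed, and values outside $0 < x < 1$ are clamped to the ends of the curve, which is an assumption about how overexposed and unprintably dark pixels were handled.

```python
import math

SPAN = 10.0 * math.log10(2.0)  # ten stops of luminance in log10 units


def print_density(luminance, max_luminance):
    """Density from the curve (3x^2 - 2x^3) * 10 log 2."""
    if luminance <= 0.0:
        return 0.0
    x = 1.0 + (math.log10(luminance) - math.log10(max_luminance)) / SPAN
    x = min(max(x, 0.0), 1.0)  # assumed clamping outside 0 < x < 1
    return (3.0 * x ** 2 - 2.0 * x ** 3) * SPAN
```

The slope of this curve with respect to $\log L$ is $6x(1-x)$, which peaks at $x = 1/2$ with the value 1.5, in agreement with the maximum point-gamma quoted above. Under the clamping assumption any luminance above $L_m$, such as the probability fraction at zero population, maps to the maximum density and so appears overexposed.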



F. Supporting information available 

Molecular mechanics, data analysis, and gel graphics 
input files and additional data tables [76], which are sufficient
to reproduce all the results reported here, are included
in this electronic preprint's source file.
The supporting information includes CHARMM input
files to generate energy minimized rotational isomers
without and with torsion restraints and to extract the
vicinal coupling torsion angles and NOESY interatomic
distances; crystallographic coordinates of cobalt dipeptide
chelate rings with added nitro groups; cobalt glycyl-leucine
dipeptide topology and parameter files; MATLAB
version 4 M-files to calculate NMR observables,
fit rotational isomer probabilities, simulate Monte Carlo
probability distributions, and generate gel graphics; and
tables of torsion angles and interatomic distances of the
energy minimized rotational isomers without torsion restraints
and of vicinal coupling constants and NOESY
cross relaxation rates calculated at these angles and distances.

The torsion angle list can be downloaded from the
backbone-dependent rotamer library Web page through
a hyperlink to the file of dihedral angles for all chains at
http://www.fccc.edu/research/labs/dunbrack/sidechain.html



